Class BaseUTF8Encoding
java.lang.Object
org.jcodings.Encoding
org.jcodings.AbstractEncoding
org.jcodings.MultiByteEncoding
org.jcodings.unicode.UnicodeEncoding
org.jcodings.specific.BaseUTF8Encoding
- All Implemented Interfaces:
Cloneable
- Direct Known Subclasses:
NonStrictUTF8Encoding, UTF8Encoding
-
Field Summary
Fields
Modifier and Type, Field
private static final int INVALID_CODE_FE
private static final int INVALID_CODE_FF
(package private) static final boolean USE_INVALID_CODE_SCHEME
private static final int VALID_CODE_LIMIT
-
Constructor Summary
Constructors
protected BaseUTF8Encoding(int[] EncLen, int[][] Trans)
-
Method Summary
Modifier and Type, Method: Description
int codeToMbc(int code, byte[] bytes, int p): Extracts a code point into its multibyte representation.
int codeToMbcLength(int code): Returns character length given a code point. Oniguruma equivalent: code_to_mbclen
int[] ctypeCodeRange(int ctype, IntHolder sbOut): utf8_get_ctype_code_range
getCharsetName(): The name of the equivalent Java Charset for this encoding.
boolean isNewLine(byte[] bytes, int p, int end): onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
boolean isReverseMatchAllowed(byte[] bytes, int p, int end): onigenc_always_true_is_allowed_reverse_match
int leftAdjustCharHead(byte[] bytes, int p, int s, int end): utf8_left_adjust_char_head
int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold): onigenc_ascii_mbc_case_fold
int mbcToCode(byte[] bytes, int p, int end): Returns code point for a character. Oniguruma equivalent: mbc_to_code
(package private) static byte trail0(int code)
(package private) static byte trailS(int code, int shift)
private static boolean utf8IsLead(int c)
Methods inherited from class UnicodeEncoding
applyAllCaseFold, caseFoldCodesByString, caseMap, ctypeCodeRange, isCodeCType, isInCodeRange, propertyNameToCType
Methods inherited from class MultiByteEncoding
isInRange, length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
Methods inherited from class AbstractEncoding
asciiApplyAllCaseFold, asciiCaseFoldCodesByString, asciiMbcCaseFold, isCodeCTypeInternal
Methods inherited from class Encoding
asciiToLower, asciiToUpper, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, length, load, load, maxLength, maxLengthDistance, mbcodeStartPosition, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
Field Details
-
USE_INVALID_CODE_SCHEME
static final boolean USE_INVALID_CODE_SCHEME
-
INVALID_CODE_FE
private static final int INVALID_CODE_FE
-
INVALID_CODE_FF
private static final int INVALID_CODE_FF
-
VALID_CODE_LIMIT
private static final int VALID_CODE_LIMIT
-
-
Constructor Details
-
BaseUTF8Encoding
protected BaseUTF8Encoding(int[] EncLen, int[][] Trans)
-
-
Method Details
-
getCharsetName
Description copied from class: Encoding
The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
- Overrides:
getCharsetName in class UnicodeEncoding
- Returns:
the name of the equivalent Java Charset for this encoding
-
isNewLine
public boolean isNewLine(byte[] bytes, int p, int end)
Description copied from class: AbstractEncoding
onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
- Overrides:
isNewLine in class AbstractEncoding
-
codeToMbcLength
public int codeToMbcLength(int code)
Description copied from class: Encoding
Returns character length given a code point. Oniguruma equivalent: code_to_mbclen
- Specified by:
codeToMbcLength in class Encoding
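To illustrate the mapping from code point to byte length that this method describes, here is a minimal standalone sketch of the standard UTF-8 length rule. It is not the jcodings implementation: the names Utf8LengthSketch and utf8Length are hypothetical, and the sketch does not model this class's INVALID_CODE_FE, INVALID_CODE_FF, or VALID_CODE_LIMIT constants, whose values are not shown on this page.

```java
// Hypothetical illustration only; not part of org.jcodings.
final class Utf8LengthSketch {

    /**
     * Returns the number of bytes standard UTF-8 uses for the given
     * code point, or -1 if the value is outside U+0000..U+10FFFF.
     * Surrogate code points are not rejected in this sketch.
     */
    static int utf8Length(int code) {
        if (code < 0) return -1;
        if (code <= 0x7f) return 1;      // ASCII
        if (code <= 0x7ff) return 2;     // 110xxxxx 10xxxxxx
        if (code <= 0xffff) return 3;    // 1110xxxx 10xxxxxx 10xxxxxx
        if (code <= 0x10ffff) return 4;  // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return -1;
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xe9, 0x20ac, 0x1f600};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %d byte(s)%n", cp, utf8Length(cp));
        }
    }
}
```

How the real method reports out-of-range values (error code, exception, or the invalid-code scheme) is not stated here, so the -1 return is purely an assumption of this sketch.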
-
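mbcToCode
public int mbcToCode(byte[] bytes, int p, int end)
Returns code point for a character. Oniguruma equivalent: mbc_to_code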
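For readers unfamiliar with what "code point for a character" involves, the sketch below decodes one UTF-8 sequence from a byte array using the same (bytes, p, end) parameter shape. It is an assumption-laden illustration, not the library's code: Utf8DecodeSketch and decode are hypothetical names, the -1 error return is invented for the example, and overlong encodings are not rejected.

```java
// Hypothetical illustration only; not part of org.jcodings.
final class Utf8DecodeSketch {

    /**
     * Decodes the UTF-8 character that starts at bytes[p] and must end
     * before index 'end'. Returns the code point, or -1 if the sequence
     * is truncated or structurally malformed.
     */
    static int decode(byte[] bytes, int p, int end) {
        if (p >= end) return -1;
        int b0 = bytes[p] & 0xff;          // treat the byte as unsigned
        if (b0 < 0x80) return b0;          // single-byte ASCII

        int len;
        int code;
        if ((b0 & 0xe0) == 0xc0) {         // 110xxxxx: two bytes
            len = 2; code = b0 & 0x1f;
        } else if ((b0 & 0xf0) == 0xe0) {  // 1110xxxx: three bytes
            len = 3; code = b0 & 0x0f;
        } else if ((b0 & 0xf8) == 0xf0) {  // 11110xxx: four bytes
            len = 4; code = b0 & 0x07;
        } else {
            return -1;                     // continuation byte or invalid lead
        }
        if (p + len > end) return -1;      // truncated sequence

        for (int i = 1; i < len; i++) {
            int b = bytes[p + i] & 0xff;
            if ((b & 0xc0) != 0x80) return -1;  // not a continuation byte
            code = (code << 6) | (b & 0x3f);    // append six payload bits
        }
        return code;
    }

    public static void main(String[] args) {
        byte[] euro = {(byte) 0xe2, (byte) 0x82, (byte) 0xac};
        System.out.printf("U+%04X%n", decode(euro, 0, euro.length));
    }
}
```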
-
trailS
static byte trailS(int code, int shift)
-
trail0
static byte trail0(int code)
-
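codeToMbc
public int codeToMbc(int code, byte[] bytes, int p)
Extracts a code point into its multibyte representation.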
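The sketch below shows one way a code point can be written out as UTF-8 bytes, using a small helper for continuation bytes. The package-private trailS and trail0 helpers listed on this page plausibly play a similar role, but their bodies are not shown, so the trail helper here is a hypothetical stand-in. Utf8EncodeSketch, encode, and the -1 error return are likewise assumptions made for the example.

```java
// Hypothetical illustration only; not part of org.jcodings.
final class Utf8EncodeSketch {

    /** Builds a continuation byte (10xxxxxx) from the six bits of 'code' starting at bit 'shift'. */
    static byte trail(int code, int shift) {
        return (byte) (((code >>> shift) & 0x3f) | 0x80);
    }

    /**
     * Writes 'code' as standard UTF-8 into 'bytes' starting at index p.
     * Returns the number of bytes written, or -1 if the code point is
     * outside U+0000..U+10FFFF. The caller must supply enough room.
     */
    static int encode(int code, byte[] bytes, int p) {
        if (code < 0 || code > 0x10ffff) return -1;
        if (code <= 0x7f) {
            bytes[p] = (byte) code;
            return 1;
        }
        if (code <= 0x7ff) {
            bytes[p] = (byte) (0xc0 | (code >>> 6));
            bytes[p + 1] = trail(code, 0);
            return 2;
        }
        if (code <= 0xffff) {
            bytes[p] = (byte) (0xe0 | (code >>> 12));
            bytes[p + 1] = trail(code, 6);
            bytes[p + 2] = trail(code, 0);
            return 3;
        }
        bytes[p] = (byte) (0xf0 | (code >>> 18));
        bytes[p + 1] = trail(code, 12);
        bytes[p + 2] = trail(code, 6);
        bytes[p + 3] = trail(code, 0);
        return 4;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        int n = encode(0x20ac, buf, 0);
        for (int i = 0; i < n; i++) {
            System.out.printf("%02X ", buf[i] & 0xff);
        }
        System.out.println();
    }
}
```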
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
Description copied from class: AbstractEncoding
onigenc_ascii_mbc_case_fold
- Overrides:
mbcCaseFold in class UnicodeEncoding
- Parameters:
flag - case fold flag
pp - an IntHolder that points at the character head
fold - a buffer that receives the case-folded character
Oniguruma equivalent: mbc_case_fold
-
ctypeCodeRange
public int[] ctypeCodeRange(int ctype, IntHolder sbOut)
utf8_get_ctype_code_range
- Specified by:
ctypeCodeRange in class Encoding
-
utf8IsLead
private static boolean utf8IsLead(int c)
-
leftAdjustCharHead
public int leftAdjustCharHead(byte[] bytes, int p, int s, int end)
utf8_left_adjust_char_head
- Specified by:
leftAdjustCharHead in class Encoding
- Parameters:
bytes - byte stream
p - position
s - stop
end - end
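The general idea behind adjusting to a character head in UTF-8 is to step backward past continuation bytes (10xxxxxx) until a lead byte is found. The sketch below illustrates that idea only. It uses its own parameter names (start, pos) because the exact roles of p and s are described only tersely above. Utf8CharHeadSketch, isLead, and leftAdjust are hypothetical names, and the real utf8IsLead and leftAdjustCharHead may differ in detail.

```java
// Hypothetical illustration only; not part of org.jcodings.
final class Utf8CharHeadSketch {

    /** True if the unsigned byte value c is not a UTF-8 continuation byte (10xxxxxx). */
    static boolean isLead(int c) {
        return (c & 0xc0) != 0x80;
    }

    /**
     * Moves 'pos' left to the first byte of the character that contains it,
     * never going below 'start'. Returns 'pos' unchanged if it is already
     * at or before 'start'.
     */
    static int leftAdjust(byte[] bytes, int start, int pos) {
        if (pos <= start) return pos;
        int i = pos;
        while (i > start && !isLead(bytes[i] & 0xff)) {
            i--;  // skip continuation bytes
        }
        return i;
    }

    public static void main(String[] args) {
        // "a", U+20AC (three bytes), "b"
        byte[] s = {0x61, (byte) 0xe2, (byte) 0x82, (byte) 0xac, 0x62};
        for (int pos = 0; pos < s.length; pos++) {
            System.out.println(pos + " -> " + leftAdjust(s, 0, pos));
        }
    }
}
```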
-
isReverseMatchAllowed
public boolean isReverseMatchAllowed(byte[] bytes, int p, int end)
onigenc_always_true_is_allowed_reverse_match
- Specified by:
isReverseMatchAllowed in class Encoding
-