Class CESU8Encoding
java.lang.Object
org.jcodings.Encoding
org.jcodings.AbstractEncoding
org.jcodings.MultiByteEncoding
org.jcodings.unicode.UnicodeEncoding
org.jcodings.specific.CESU8Encoding
- All Implemented Interfaces:
Cloneable
-
Field Summary
Fields
Modifier and Type, Field
private static final int[] CESU8EncLen
(package private) static final int[][] CESU8Trans
static final CESU8Encoding INSTANCE
private static final int INVALID_CODE_FE
private static final int INVALID_CODE_FF
(package private) static final boolean USE_INVALID_CODE_SCHEME
private static final int VALID_CODE_LIMIT
-
Constructor Summary
Constructors
protected CESU8Encoding()
-
Method Summary
Modifier and Type, Method, Description
int codeToMbc(int code, byte[] bytes, int p): Extracts a code point into its multibyte representation.
int codeToMbcLength(int code): Returns the character length for a given code point. Oniguruma equivalent: code_to_mbclen
int[] ctypeCodeRange(int ctype, IntHolder sbOut): Returns the code range for a given character type. Oniguruma equivalent: get_ctype_code_range
String getCharsetName(): The name of the equivalent Java Charset for this encoding.
boolean isNewLine(byte[] bytes, int p, int end): onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
boolean isReverseMatchAllowed(byte[] bytes, int p, int end): Returns true if it is safe to use the reverse Boyer-Moore fail-fast search algorithm. Oniguruma equivalent: is_allowed_reverse_match
int leftAdjustCharHead(byte[] bytes, int p, int s, int end): Seeks the previous character head in a stream. Oniguruma equivalent: left_adjust_char_head
int length(byte[] bytes, int p, int end): Returns the character length given the stream, the character position, and the stream end. Returns 1 for single-byte encodings; for multibyte encodings, performs sanity validation and returns the character length, or the missing characters in the stream otherwise.
private int lengthForOneUptoSix(byte[] bytes, int p, int end, int b, int s)
int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold): onigenc_ascii_mbc_case_fold
int mbcToCode(byte[] bytes, int p, int end): Returns the code point for a character. Oniguruma equivalent: mbc_to_code
(package private) static byte trail0(int code)
(package private) static byte trail0(long code)
(package private) static byte trailS(int code, int shift)
(package private) static byte trailS(long code, int shift)
private static boolean utf8IsLead(int c)
Methods inherited from class UnicodeEncoding
applyAllCaseFold, caseFoldCodesByString, caseMap, ctypeCodeRange, isCodeCType, isInCodeRange, propertyNameToCType
Methods inherited from class MultiByteEncoding
isInRange, length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
Methods inherited from class AbstractEncoding
asciiApplyAllCaseFold, asciiCaseFoldCodesByString, asciiMbcCaseFold, isCodeCTypeInternal
Methods inherited from class Encoding
asciiToLower, asciiToUpper, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, load, load, maxLength, maxLengthDistance, mbcodeStartPosition, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
Field Details
-
USE_INVALID_CODE_SCHEME
static final boolean USE_INVALID_CODE_SCHEME
-
INVALID_CODE_FE
private static final int INVALID_CODE_FE
-
INVALID_CODE_FF
private static final int INVALID_CODE_FF
-
VALID_CODE_LIMIT
private static final int VALID_CODE_LIMIT
-
CESU8EncLen
private static final int[] CESU8EncLen
-
CESU8Trans
static final int[][] CESU8Trans
-
INSTANCE
public static final CESU8Encoding INSTANCE
-
-
Constructor Details
-
CESU8Encoding
protected CESU8Encoding()
-
-
Method Details
-
getCharsetName
public String getCharsetName()
Description copied from class: Encoding
The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
Overrides: getCharsetName in class UnicodeEncoding
Returns: the name of the equivalent Java Charset for this encoding
-
length
public int length(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns the character length given the stream, the character position, and the stream end. Returns 1 for single-byte encodings; for multibyte encodings, performs sanity validation and returns the character length, or the missing characters in the stream otherwise.
-
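As a rough illustration of what such a length check involves for CESU-8, the following standalone sketch derives a character's byte length from its lead byte and validates the trail bytes. It is not the jcodings implementation: the class `Cesu8LengthSketch` and the method `charLength` are hypothetical names, and the sketch simply returns -1 for malformed or truncated input rather than reporting missing characters. It relies only on the CESU-8 definition, in which a supplementary code point is stored as two 3-byte surrogate halves (6 bytes in total).

```java
final class Cesu8LengthSketch {
    /**
     * Hypothetical helper (not part of jcodings): byte length of the CESU-8
     * character starting at p, or -1 if the bytes in [p, end) are malformed
     * or truncated. Overlong forms are not rejected, to keep the sketch short.
     */
    static int charLength(byte[] bytes, int p, int end) {
        if (p >= end) return -1;
        int b = bytes[p] & 0xff;
        if (b < 0x80) return 1;                    // ASCII
        int len;
        if ((b & 0xe0) == 0xc0) len = 2;           // lead byte 110xxxxx
        else if ((b & 0xf0) == 0xe0) len = 3;      // lead byte 1110xxxx
        else return -1;                            // trail byte or 4-byte UTF-8 lead
        if (p + len > end) return -1;              // truncated sequence
        for (int i = 1; i < len; i++) {
            if ((bytes[p + i] & 0xc0) != 0x80) return -1; // trail bytes must be 10xxxxxx
        }
        // A high surrogate must be followed by a 3-byte low surrogate.
        if (len == 3 && unitInRange(bytes, p, end, 0xd800, 0xdbff)) {
            return unitInRange(bytes, p + 3, end, 0xdc00, 0xdfff) ? 6 : -1;
        }
        return len;
    }

    /** True if a complete 3-byte sequence at p decodes to a UTF-16 unit in [lo, hi]. */
    private static boolean unitInRange(byte[] bytes, int p, int end, int lo, int hi) {
        if (p + 3 > end || (bytes[p] & 0xf0) != 0xe0) return false;
        if ((bytes[p + 1] & 0xc0) != 0x80 || (bytes[p + 2] & 0xc0) != 0x80) return false;
        int unit = ((bytes[p] & 0x0f) << 12) | ((bytes[p + 1] & 0x3f) << 6) | (bytes[p + 2] & 0x3f);
        return unit >= lo && unit <= hi;
    }

    public static void main(String[] args) {
        // U+1F600 in CESU-8: surrogates D83D and DE00, three bytes each.
        byte[] emoji = {(byte) 0xed, (byte) 0xa0, (byte) 0xbd, (byte) 0xed, (byte) 0xb8, (byte) 0x80};
        System.out.println(charLength(emoji, 0, emoji.length)); // 6
        System.out.println(charLength(emoji, 0, 3));            // -1 (low surrogate missing)
    }
}
```

A lone low surrogate is reported as an ordinary 3-byte character here; whether the library treats it that way is not stated in this page.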
lengthForOneUptoSix
private int lengthForOneUptoSix(byte[] bytes, int p, int end, int b, int s)
-
isNewLine
public boolean isNewLine(byte[] bytes, int p, int end)
Description copied from class: AbstractEncoding
onigenc_is_mbc_newline_0x0a / used also by multibyte encodings
Overrides: isNewLine in class AbstractEncoding
-
codeToMbcLength
public int codeToMbcLength(int code)
Description copied from class: Encoding
Returns the character length for a given code point. Oniguruma equivalent: code_to_mbclen
Specified by: codeToMbcLength in class Encoding
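To make the mapping from code point to byte length concrete, here is a minimal standalone sketch based on the CESU-8 definition alone. The class `Cesu8CodeLengthSketch` and the method `encodedLength` are hypothetical names, and the sketch ignores whatever special handling the fields `USE_INVALID_CODE_SCHEME`, `INVALID_CODE_FE`, and `INVALID_CODE_FF` imply in the actual class.

```java
final class Cesu8CodeLengthSketch {
    /**
     * Hypothetical helper (not part of jcodings): number of bytes CESU-8
     * uses for a Unicode code point.
     */
    static int encodedLength(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        if (codePoint < 0x80) return 1;     // ASCII
        if (codePoint < 0x800) return 2;    // up to 11 bits
        if (codePoint < 0x10000) return 3;  // rest of the Basic Multilingual Plane
        return 6;                           // surrogate pair, 3 bytes per half
    }

    public static void main(String[] args) {
        System.out.println(encodedLength('A'));     // 1
        System.out.println(encodedLength(0x20ac));  // 3
        System.out.println(encodedLength(0x1f600)); // 6
    }
}
```

The jump from 3 to 6 is the visible difference from UTF-8, which would use 4 bytes for the same supplementary code point.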
-
mbcToCode
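public int mbcToCode(byte[] bytes, int p, int end)
Returns the code point for a character. Oniguruma equivalent: mbc_to_code
-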
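For mbcToCode, the following standalone sketch shows how a code point can be recovered from CESU-8 bytes. It is an illustration under stated assumptions, not the jcodings code: `Cesu8DecodeSketch` and `decode` are hypothetical names, and the input is assumed to be well formed and complete, so no validation errors are reported.

```java
final class Cesu8DecodeSketch {
    /**
     * Hypothetical helper (not part of jcodings): decodes the CESU-8
     * character at p. Assumes the bytes in [p, end) are well formed.
     */
    static int decode(byte[] bytes, int p, int end) {
        int b0 = bytes[p] & 0xff;
        if (b0 < 0x80) return b0;                               // ASCII
        if ((b0 & 0xe0) == 0xc0) {                              // 2-byte form
            return ((b0 & 0x1f) << 6) | (bytes[p + 1] & 0x3f);
        }
        int unit = unit3(bytes, p);                             // 3-byte form: one UTF-16 unit
        boolean lowFollows = p + 6 <= end && (bytes[p + 3] & 0xf0) == 0xe0;
        if (Character.isHighSurrogate((char) unit) && lowFollows) {
            int low = unit3(bytes, p + 3);
            if (Character.isLowSurrogate((char) low)) {
                // Combine the two halves into one supplementary code point.
                return Character.toCodePoint((char) unit, (char) low);
            }
        }
        return unit;
    }

    /** Decodes a 3-byte sequence at p into a 16-bit value. */
    private static int unit3(byte[] bytes, int p) {
        return ((bytes[p] & 0x0f) << 12) | ((bytes[p + 1] & 0x3f) << 6) | (bytes[p + 2] & 0x3f);
    }

    public static void main(String[] args) {
        byte[] emoji = {(byte) 0xed, (byte) 0xa0, (byte) 0xbd, (byte) 0xed, (byte) 0xb8, (byte) 0x80};
        System.out.println(Integer.toHexString(decode(emoji, 0, emoji.length))); // 1f600
    }
}
```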
trailS
static byte trailS(int code, int shift)
-
trail0
static byte trail0(int code)
-
trailS
static byte trailS(long code, int shift)
-
trail0
static byte trail0(long code)
-
codeToMbc
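public int codeToMbc(int code, byte[] bytes, int p)
Extracts a code point into its multibyte representation.
-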
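For codeToMbc, the sketch below shows one way to write a code point into a byte buffer as CESU-8. It is a standalone illustration, not the jcodings code: `Cesu8EncodeSketch`, `encode`, and `putUnit` are hypothetical names, and returning the number of bytes written is an assumption about a convenient contract rather than a documented property of `codeToMbc`.

```java
final class Cesu8EncodeSketch {
    /**
     * Hypothetical helper (not part of jcodings): writes codePoint into out
     * starting at p and returns the number of bytes written (1, 2, 3, or 6).
     */
    static int encode(int codePoint, byte[] out, int p) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        if (codePoint < 0x80) {
            out[p] = (byte) codePoint;
            return 1;
        }
        if (codePoint < 0x800) {
            out[p] = (byte) (0xc0 | (codePoint >>> 6));
            out[p + 1] = (byte) (0x80 | (codePoint & 0x3f));
            return 2;
        }
        if (codePoint < 0x10000) {
            putUnit(codePoint, out, p);
            return 3;
        }
        // Supplementary code point: encode each UTF-16 surrogate as 3 bytes.
        putUnit(Character.highSurrogate(codePoint), out, p);
        putUnit(Character.lowSurrogate(codePoint), out, p + 3);
        return 6;
    }

    /** Writes a 16-bit value as a 3-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx. */
    private static void putUnit(int unit, byte[] out, int p) {
        out[p] = (byte) (0xe0 | (unit >>> 12));
        out[p + 1] = (byte) (0x80 | ((unit >>> 6) & 0x3f));
        out[p + 2] = (byte) (0x80 | (unit & 0x3f));
    }

    public static void main(String[] args) {
        byte[] out = new byte[6];
        int n = encode(0x1f600, out, 0);
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < n; i++) hex.append(String.format("%02x ", out[i] & 0xff));
        System.out.println(hex.toString().trim()); // ed a0 bd ed b8 80
    }
}
```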
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
Description copied from class: AbstractEncoding
onigenc_ascii_mbc_case_fold
Overrides: mbcCaseFold in class UnicodeEncoding
Parameters:
flag - case fold flag
pp - an IntHolder that points at the character head
fold - a buffer into which the case-folded character is extracted
Oniguruma equivalent: mbc_case_fold
-
ctypeCodeRange
public int[] ctypeCodeRange(int ctype, IntHolder sbOut)
Description copied from class: Encoding
Returns the code range for a given character type. Oniguruma equivalent: get_ctype_code_range
Specified by: ctypeCodeRange in class Encoding
-
utf8IsLead
private static boolean utf8IsLead(int c)
-
leftAdjustCharHead
public int leftAdjustCharHead(byte[] bytes, int p, int s, int end)
Description copied from class: Encoding
Seeks the previous character head in a stream. Oniguruma equivalent: left_adjust_char_head
Specified by: leftAdjustCharHead in class Encoding
Parameters:
bytes - byte stream
p - position
s - stop
end - end
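Seeking a character head in CESU-8 can be sketched as stepping back over trail bytes (10xxxxxx) and then, if the position lands on a low surrogate that follows a high surrogate, stepping back three more bytes to the start of the pair. The code below is a standalone illustration, not the jcodings implementation: `Cesu8HeadSketch`, `leftAdjust`, and the parameter names `start` and `pos` are hypothetical and are deliberately not mapped onto the `p` and `s` parameters documented above. It assumes `start <= pos < end`.

```java
final class Cesu8HeadSketch {
    /**
     * Hypothetical helper (not part of jcodings): returns the index of the
     * head of the character that contains pos, never going below start.
     */
    static int leftAdjust(byte[] bytes, int start, int pos, int end) {
        if (pos <= start) return pos;
        int q = pos;
        // Step back while on a trail byte (10xxxxxx).
        while (q > start && (bytes[q] & 0xc0) == 0x80) q--;
        // A low surrogate preceded by a high surrogate belongs to a 6-byte character.
        if (unitInRange(bytes, q, end, 0xdc00, 0xdfff)
                && q - 3 >= start
                && unitInRange(bytes, q - 3, end, 0xd800, 0xdbff)) {
            q -= 3;
        }
        return q;
    }

    /** True if a complete 3-byte sequence at p decodes to a UTF-16 unit in [lo, hi]. */
    private static boolean unitInRange(byte[] bytes, int p, int end, int lo, int hi) {
        if (p + 3 > end || (bytes[p] & 0xf0) != 0xe0) return false;
        if ((bytes[p + 1] & 0xc0) != 0x80 || (bytes[p + 2] & 0xc0) != 0x80) return false;
        int unit = ((bytes[p] & 0x0f) << 12) | ((bytes[p + 1] & 0x3f) << 6) | (bytes[p + 2] & 0x3f);
        return unit >= lo && unit <= hi;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600 encoded as a CESU-8 surrogate pair.
        byte[] bytes = {0x61, (byte) 0xed, (byte) 0xa0, (byte) 0xbd, (byte) 0xed, (byte) 0xb8, (byte) 0x80};
        System.out.println(leftAdjust(bytes, 0, 5, bytes.length)); // 1
    }
}
```

Whether the library's own method performs the surrogate-pair step is not stated in this page; the sketch includes it because a position inside the second half of a pair is otherwise left in the middle of a character.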
-
isReverseMatchAllowed
public boolean isReverseMatchAllowed(byte[] bytes, int p, int end)
Description copied from class: Encoding
Returns true if it is safe to use the reverse Boyer-Moore fail-fast search algorithm. Oniguruma equivalent: is_allowed_reverse_match
Specified by: isReverseMatchAllowed in class Encoding
-