class UnicodeUtils::Codepoint

A Codepoint instance represents a single Unicode code point.

UnicodeUtils::Codepoint.new(0x20ac) => #<U+20AC "€" EURO SIGN utf8:e2,82,ac>

Constants

RANGE

The Unicode codespace, U+0000 through U+10FFFF. Any integer in this range is a Unicode code point.

Public Class Methods

new(int)

Create a Codepoint instance that wraps the given Integer. int must be in Codepoint::RANGE; otherwise an ArgumentError is raised.

# File lib/unicode_utils/codepoint.rb, line 17
def initialize(int)
  unless RANGE.include?(int)
    raise ArgumentError, "#{int} not in codespace"
  end
  @int = int
end
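The range check can be tried without installing the gem. The following is a minimal sketch, not the library itself: DemoCodepoint is a hypothetical stand-in that copies the constructor and ord shown on this page, and the value 0..0x10FFFF for RANGE is an assumption based on the definition of the Unicode codespace.

```ruby
# Hypothetical stand-in for UnicodeUtils::Codepoint. Only the constructor
# and ord are reproduced from the source listings on this page.
class DemoCodepoint
  # Assumption: the codespace is 0..0x10FFFF, as defined by Unicode.
  RANGE = 0..0x10FFFF

  def initialize(int)
    unless RANGE.include?(int)
      raise ArgumentError, "#{int} not in codespace"
    end
    @int = int
  end

  def ord
    @int
  end
end

# An integer inside the codespace is accepted.
valid = DemoCodepoint.new(0x20ac)

# One past the last code point is rejected with ArgumentError.
error_message =
  begin
    DemoCodepoint.new(0x110000)
    nil
  rescue ArgumentError => e
    e.message
  end
```

Because the message interpolates the integer in decimal, the rejected value 0x110000 is reported as 1114112.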

Public Instance Methods

hexbytes()

Get the bytes used to encode this code point in UTF-8, hex-formatted.

Codepoint.new(0xe4).hexbytes => "c3,a4"
# File lib/unicode_utils/codepoint.rb, line 54
def hexbytes
  to_s.bytes.map { |b| sprintf("%02x", b) }.join(",")
end
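The number of hex pairs in the result follows the UTF-8 encoding length of the code point, from one to four bytes. A small sketch using plain Ruby only; utf8_hexbytes is a hypothetical helper that applies the same expression as hexbytes to an Integer directly:

```ruby
# Hypothetical helper: same expression as Codepoint#hexbytes, but taking
# the integer directly instead of going through a Codepoint instance.
def utf8_hexbytes(int)
  int.chr(Encoding::UTF_8).bytes.map { |b| sprintf("%02x", b) }.join(",")
end

ascii = utf8_hexbytes(0x41)     # one byte
latin = utf8_hexbytes(0xe4)     # two bytes
euro  = utf8_hexbytes(0x20ac)   # three bytes
emoji = utf8_hexbytes(0x1f600)  # four bytes
```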
inspect()

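Returns a string of the form #<U+… char name utf8:hexbytes>. If name returns nil, the literal text nil appears in place of the name.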

# File lib/unicode_utils/codepoint.rb, line 59
def inspect
  "#<#{uplus} #{to_s.inspect} #{name || "nil"} utf8:#{hexbytes}>"
end
name()

Get the normative Unicode name of this code point.

See also: UnicodeUtils.char_name

# File lib/unicode_utils/codepoint.rb, line 39
def name
  UnicodeUtils.char_name(@int)
end
ord()

Convert to Integer.

# File lib/unicode_utils/codepoint.rb, line 25
def ord
  @int
end
to_s()

Convert this code point to a UTF-8 encoded string. Returns a new string on each call, so the caller may safely mutate the return value.

# File lib/unicode_utils/codepoint.rb, line 46
def to_s
  @int.chr(Encoding::UTF_8)
end
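The claim that each call yields a fresh string can be checked against the underlying Integer#chr call. This sketch uses only core Ruby and calls chr directly rather than going through a Codepoint instance:

```ruby
# Two independent conversions of the same code point (U+20AC).
first  = 0x20ac.chr(Encoding::UTF_8)
second = 0x20ac.chr(Encoding::UTF_8)

# Mutating one result in place leaves the other untouched.
first << "!"

# equal? compares object identity, not content.
same_object = first.equal?(second)
```

Since the two results are distinct objects, appending to first has no effect on second.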
uplus()

Format in U+ notation.

Codepoint.new(0xc5).uplus => "U+00C5"
# File lib/unicode_utils/codepoint.rb, line 32
def uplus
  sprintf('U+%04X', @int)
end
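The %04X directive pads to at least four uppercase hex digits but does not truncate, so code points beyond the Basic Multilingual Plane print with five or six digits. A short sketch with a hypothetical free-standing uplus helper that uses the same format string:

```ruby
# Hypothetical helper: same format string as Codepoint#uplus.
def uplus(int)
  sprintf('U+%04X', int)
end

padded     = uplus(0x41)      # zero-padded to four digits
five_digit = uplus(0x1f600)   # wider values are not truncated
six_digit  = uplus(0x10ffff)  # last code point in the codespace
```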