Lasso Soft Inc. > Home

[lp_math_ucsToUTF8]

Link: lp_math_ucsToUTF8
Author: Bil Corry
Category: Math
Version: 8.x
License: Public Domain
Posted: 21 Apr 2006
Updated: 21 Apr 2006

Description

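Returns the UTF-8 byte value, as a decimal integer, for a UCS value passed either as a single byte value or as an array of byte values. Pass -hex to supply the value as a hexadecimal string instead.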

Requires [lp_math_octettodec] [lp_math_hextodec] [lp_math_dectobin] [lp_integer_bits] [lp_string_pad] [lp_math_bintodec]

Sample Usage

	lp_string_chr: (lp_math_ucstoUTF8: (lp_math_hextodec:'00E4')); '<br>'; // returns ä
	lp_string_chr: (lp_math_ucstoUTF8: '00E4', -hex); '<br>'; // returns ä

	// http://www.unicode.org/charts/PDF/U2070.pdf
	loop: -from=2080, -to=2089;
		lp_string_chr: (lp_math_ucstoUTF8: loop_count, -hex);
	/loop;
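As a cross-check on the sample usage, the following Python sketch computes the integer the tag would be expected to return for each sample input. Python is used only for illustration and is not part of the tag. The helper name utf8_as_int is hypothetical, and the sketch assumes the tag returns the UTF-8 bytes packed into one integer, most significant byte first, which is what the final lp_math_bintodec call in the source suggests.

```python
def utf8_as_int(code_point: int) -> int:
    """Pack the UTF-8 bytes of a code point into one big-endian integer.

    Assumption: this mirrors what lp_math_ucsToUTF8 returns; it relies on
    Python's built-in encoder rather than the tag's own bit manipulation.
    """
    return int.from_bytes(chr(code_point).encode("utf-8"), "big")


# U+00E4 (ä) encodes as the two bytes C3 A4.
print(hex(utf8_as_int(0x00E4)))

# Subscript digits U+2080..U+2089, the range covered by the loop above.
for cp in range(0x2080, 0x208A):
    print(f"U+{cp:04X} -> {utf8_as_int(cp):X}")
```

Note that the loop in the sample passes loop_count with -hex, so the decimal counter values 2080 to 2089 are read as the hexadecimal code points U+2080 to U+2089.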

Source Code

To install this tag, save the source below as an ".inc" file in your LassoStartup folder and restart Lasso. You can then begin using the tag immediately.

[

define_tag:'lp_math_ucsToUTF8',
	-description='Returns a UTF-8 byte value given an UCS byte value or an array of byte values.',
	-priority='replace',
	-required='bytes';

// http://www.ietf.org/rfc/rfc3629.txt
// http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
// http://www.unicode.org/charts/
// http://www.426.ch/ascii.html

/*
   Char. number range    |        UTF-8 octet sequence
      (hexadecimal)      |              (binary)
   ----------------------+-------------------------------------
   0000 0000 - 0000 007F | 0xxxxxxx
   0000 0080 - 0000 07FF | 110xxxxx 10xxxxxx
   0000 0800 - 0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
   0001 0000 - 0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx


	My chart:

   Char.    |  UTF-8 octet sequence
   # bits   | (binary)
   ---------+-------------------------------------
   0  - 7   | 0xxxxxxx
   8  - 11  | 110xxxxx 10xxxxxx
   12 - 16  | 1110xxxx 10xxxxxx 10xxxxxx
   17 - 21  | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx




var:'test' = '1D120';
lp_math_hextobin: $test;'<br>';
lp_math_dectobin: (lp_math_ucstoUTF8: $test, -hex);'<br>';

000 011101 000100 100000
11110 000 10 011101 10 000100 10 100000

*/

if: params->(find:'-hex')->size;
	// passed hex instead
	local:'dec' = (lp_math_octettodec: (lp_math_hextodec: #bytes));
else;
	local:'dec' = (lp_math_octettodec: #bytes);
/if;

local:'bits' = (lp_math_dectobin: #dec);
local:'out' = string;

if: (lp_integer_bits: #dec) <= 7;
	#out = (lp_string_pad: #bits, 7, '0');

else: (lp_integer_bits: #dec) <= 11;
	#bits = (lp_string_pad: #bits, 11, '0');
	#out = '110' (#bits->(substring: 1, 5)) '10' (#bits->(substring: 6, 6));

else: (lp_integer_bits: #dec) <= 16;
	#bits = (lp_string_pad: #bits, 16, '0');
	#out = '1110' (#bits->(substring: 1, 4)) '10' (#bits->(substring: 5, 6)) '10' (#bits->(substring: 11, 6));

else: (lp_integer_bits: #dec) <= 21;
	#bits = (lp_string_pad: #bits, 21, '0');
	#out = '11110' (#bits->(substring: 1, 3)) '10' (#bits->(substring: 4, 6)) '10' (#bits->(substring: 10, 6)) '10' (#bits->(substring: 16, 6));

else;
	// out of bounds
	fail: '-1','Value out of bounds';

/if;

return: (lp_math_bintodec: #out);

/define_tag;

]
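The bit-count chart in the source comment can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the author's code: the names LAYOUTS and ucs_to_utf8 are hypothetical, and the function takes an integer code point directly rather than a byte array or hex string.

```python
# Each row: (max payload bits, lead-byte prefix, payload bits in the lead
# byte, number of continuation bytes). Mirrors the "My chart" table.
LAYOUTS = [
    (7, "0", 7, 0),
    (11, "110", 5, 1),
    (16, "1110", 4, 2),
    (21, "11110", 3, 3),
]


def ucs_to_utf8(code_point: int) -> int:
    """Return the UTF-8 encoding of code_point packed into one integer."""
    if code_point < 0:
        raise ValueError("Value out of bounds")
    for max_bits, prefix, lead_bits, n_cont in LAYOUTS:
        if code_point.bit_length() <= max_bits:
            # Left-pad the binary string to the full payload width.
            bits = format(code_point, "b").zfill(max_bits)
            out = prefix + bits[:lead_bits]
            # Each continuation byte is "10" plus the next six payload bits.
            for i in range(n_cont):
                start = lead_bits + 6 * i
                out += "10" + bits[start:start + 6]
            return int(out, 2)
    raise ValueError("Value out of bounds")


# The test value from the source comment: U+1D120 encodes as F0 9D 84 A0.
print(format(ucs_to_utf8(0x1D120), "X"))
```

Like the tag, this sketch selects the sequence length purely by bit count, so it accepts any value up to 21 bits (0x1FFFFF) and does not reject surrogates or values above 0x10FFFF, which RFC 3629 excludes.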
