libthai 0.1.29
Thai word segmentation.
Functions
ThBrk * th_brk_new (const char *dictpath)
    Create a dictionary-based word breaker.
void th_brk_delete (ThBrk *brk)
    Delete a word breaker.
int th_brk_find_breaks (ThBrk *brk, const thchar_t *s, int pos[], size_t pos_sz)
    Find word break positions in Thai string.
int th_brk_insert_breaks (ThBrk *brk, const thchar_t *in, thchar_t *out, size_t out_sz, const char *delim)
    Insert word delimiters in given string.
int th_brk (const thchar_t *s, int pos[], size_t pos_sz)
    Find word break positions in Thai string (deprecated).
int th_brk_line (const thchar_t *in, thchar_t *out, size_t out_sz, const char *delim)
    Insert word delimiters in given string (deprecated).

int th_brk (const thchar_t *s, int pos[], size_t pos_sz)
Find word break positions in Thai string.
s : the input string to be processed
pos : array to keep breaking positions
pos_sz : size of pos[]
Finds word break positions in the Thai string s and stores at most pos_sz breaking positions in pos[], from left to right. Uses the shared word breaker.
(This function is deprecated since version 0.1.25, in favor of th_brk_find_breaks(), which is more thread-safe.)

void th_brk_delete (ThBrk *brk)
Delete a word breaker.
brk : the word breaker
Frees memory associated with the word breaker.
(Available since version 0.1.25, libthai.so.0.3.0)

int th_brk_find_breaks (ThBrk *brk, const thchar_t *s, int pos[], size_t pos_sz)
Find word break positions in Thai string.
brk : the word breaker
s : the input string to be processed
pos : array to keep breaking positions
pos_sz : size of pos[]
Finds word break positions in the Thai string s and stores at most pos_sz breaking positions in pos[], from left to right.
(Available since version 0.1.25, libthai.so.0.3.0)
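The positions written to pos[] can be used to slice the input into words. The following is a minimal sketch, not part of libthai: copy_word() is a hypothetical helper, thchar_t is redeclared locally as unsigned char so that the block builds without the library headers, and it assumes that each pos[i] is the offset in s at which the next word starts and that n is the number of positions actually stored by th_brk_find_breaks().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for libthai's thchar_t so the sketch builds without its headers. */
typedef unsigned char thchar_t;

/*
 * Hypothetical helper (not a libthai function): copy word number `word`
 * (0-based) out of `s`, given `n` break positions in pos[].
 * Assumption: pos[i] is the offset in `s` where the next word begins.
 * Returns the word length, or -1 if `word` is out of range, a position is
 * inconsistent, or `out` is too small to hold the word plus its terminator.
 */
int copy_word(const thchar_t *s, const int pos[], int n,
              int word, thchar_t *out, size_t out_sz)
{
    assert(s != NULL && out != NULL);

    if (word < 0 || word > n)
        return -1;                       /* n breaks yield n + 1 words */

    size_t len   = strlen((const char *)s);
    size_t start = (word == 0) ? 0   : (size_t)pos[word - 1];
    size_t end   = (word == n) ? len : (size_t)pos[word];

    if (start > end || end > len || end - start + 1 > out_sz)
        return -1;

    memcpy(out, s + start, end - start);
    out[end - start] = '\0';
    return (int)(end - start);
}
```

With n break positions there are n + 1 words, so a caller would loop word from 0 to n inclusive. The input is treated as a NUL-terminated byte string, one byte per character.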

int th_brk_insert_breaks (ThBrk *brk, const thchar_t *in, thchar_t *out, size_t out_sz, const char *delim)
Insert word delimiters in given string.
brk : the word breaker
in : the input string to be processed
out : the output buffer
out_sz : the size of out
delim : the word delimiter to insert
Analyzes the input string and stores it in the output buffer with the given word delimiter inserted at every word boundary.
(Available since version 0.1.25, libthai.so.0.3.0)
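Because the caller supplies out and out_sz, the output buffer has to be sized before th_brk_insert_breaks() is called. The following is a minimal sketch of a conservative upper bound, not part of libthai: insert_breaks_out_size() is a hypothetical helper, thchar_t is redeclared locally as unsigned char, and the bound assumes that at most one delimiter is inserted per input character and that the output is NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for libthai's thchar_t so the sketch builds without its headers. */
typedef unsigned char thchar_t;

/*
 * Hypothetical helper (not a libthai function): a buffer size that is large
 * enough for the delimited copy of `in` under the assumption that no more
 * than one delimiter is inserted per input character.
 *   len            bytes for the original characters
 *   len * dlen     bytes for the delimiters in the worst case
 *   1              byte for the terminating NUL
 * Overflow of size_t is not checked; inputs are assumed to be small.
 */
size_t insert_breaks_out_size(const thchar_t *in, const char *delim)
{
    assert(in != NULL && delim != NULL);

    size_t len  = strlen((const char *)in);
    size_t dlen = strlen(delim);
    return len + len * dlen + 1;
}
```

A caller could allocate a buffer of this size with malloc(), pass it as out together with the same value as out_sz, and free it afterwards. The bound is deliberately loose, since real text has far fewer word boundaries than characters.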

int th_brk_line (const thchar_t *in, thchar_t *out, size_t out_sz, const char *delim)
Insert word delimiters in given string.
in : the input string to be processed
out : the output buffer
out_sz : the size of out
delim : the word delimiter to insert
Analyzes the input string and stores it in the output buffer with the given word delimiter inserted at every word boundary. Uses the shared word breaker.
(This function is deprecated since version 0.1.25, in favor of th_brk_insert_breaks(), which is more thread-safe.)

ThBrk * th_brk_new (const char *dictpath)
Create a dictionary-based word breaker.
dictpath : the dictionary path, or NULL for default
Loads the dictionary from the given file and returns the created word breaker. If dictpath is NULL, first searches in the directory given by the LIBTHAI_DICTDIR environment variable, then in the library installation directory. Returns NULL if the dictionary file is not found or cannot be loaded.
The returned ThBrk object should be destroyed after use with th_brk_delete().
In multi-threaded environments, th_brk_new() and th_brk_delete() should be called inside critical sections (for example, guarded by a mutex) when creating and destroying a word breaker instance. Between those two calls, the word breaker's methods can safely be called in parallel.
(Available since version 0.1.25, libthai.so.0.3.0)