2018-08-20  Theppitak Karoonboonyanan <theppitak@gmail.com>

	* NEWS:
	  === Version 0.6.1 ===

2018-08-18  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Add info to skip a step on released tarballs

	* INSTALL:
	  - Add info to skip ./autogen.sh step on released tarballs.

2018-08-18  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Provide our own version of INSTALL instructions.

	* INSTALL:
	  - Explain simple installation steps.

	Merging pull request #3 from @pepa65
	https://github.com/tlwg/swath/pull/3

2018-08-14  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Fix compiler warning on uncovered enum case.

	* src/filterrtf.cpp (FilterRTF::GetNextToken()):
	  - Add default: to switch(), to fix warning on uncovered enum.
	  - Remove an unnecessary break after goto.

2018-08-14  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Add comment for previous commit.

	* src/filterrtf.cpp (FilterRTF::chgCharState()):
	  - Add comment explaining unreachable line.

2018-08-14  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop unused CS_NONE ECharState in FilterRTF.

	The only place it's used is the return value of chgCharState()
	method, which has no chance to be reached, as the 'state' arg,
	which is from FilterRTF::psState, is never set to CS_NONE
	throughout the code. So, all possible values have been covered
	in 'switch (state)' and compiler warning on uncovered CS_NONE
	is in fact meaningless.

	Besides, having a "none" value for a state in a finite automaton
	is questionable. Is it a special dead state? No, it's never
	handled explicitly anyway.

	* src/filterrtf.h (FilterRTF::ECharState):
	  - Drop CS_NONE value.
	* src/filterrtf.cpp (FilterRTF::chgCharState()):
	  - Drop unreachable "return CS_NONE" statement.

2018-08-13  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Fix inverted logic in RTFToken::isEmpty().

	* src/filterrtf.cpp (RTFToken::isEmpty()):
	  - Fix inverted logic, which caused some tokens not to be flushed.
	* tests/petavatthu1-wseg.rtf:
	  - Update checking wordseg output.

2018-08-13  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Add RTF test case.

	* tests/Makefile.am, +tests/test-rtf.sh,
	  +tests/petavatthu1.rtf, +tests/petavatthu1-wseg.rtf:
	  - Add RTF test case

2018-08-13  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop unused method.

	* src/filterrtf.cpp (-RTFToken::set(char, ETokenType)):
	  - Drop unused method.

2018-08-13  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Declare array size member as size_t.

	* src/filterrtf.cpp (RTFToken):
	  - Declare RTFToken::valLen as size_t instead of int,
	    fixing "signedness" compiler warning when comparing size.

2018-08-13  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Add missing virtual d-tors.

	* conv/convkit.h (TextReader, TextWriter):
	  - Add missing virtual d-tors to the abstract base classes,
	    fixing compiler warnings.

2018-08-13  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop commented midatrie stuff.

	* src/Makefile.am:
	  - Drop commented stuff on using ancient midatrie.

2018-08-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Normalize include guard macros in headers.

	* conv/conv.h:
	* conv/convfact.h:
	* conv/convkit.h:
	* conv/tis620.h:
	* conv/tischar.h:
	* conv/unichar.h:
	* conv/utf8.h:
	  - Normalize include guard macros based on file name.

2018-08-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop "#pragma once" in headers.

	It's not portable, and the multiple-include optimization in
	compilers gives the "#ifndef" idiom the same effect.

	* src/abswordseg.h:
	* src/filefilter.h:
	* src/filterhtml.h:
	* src/filterlambda.h:
	* src/filterlatex.h:
	* src/filterrtf.h:
	* src/filterx.h:
	* src/longwordseg.h:
	* src/maxwordseg.h:
	* src/wordstack.h:
	  - Drop "#pragma once".

2018-08-08  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Use HTTPS in linux.thai.net URLs

	* AUTHORS:
	  - Use HTTPS in thailatex and libthai project URLs.

2018-08-05  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Update dictionary from thailatex GitHub.

	* data/Makefile.am:
	* data/tdict-city.txt:
	* data/tdict-common.txt:
	* data/tdict-country.txt:
	* data/tdict-district.txt:
	* data/tdict-geo.txt:
	* data/tdict-history.txt:
	* data/tdict-ict.txt:
	* data/tdict-lang-ethnic.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* +data/tdict-slang.txt:
	* data/tdict-spell.txt:
	* data/tdict-std.txt:
	  - Update dictionary from dehyphenated source for thailatex
	    hyphenation patterns (based on libthai 0.1.28).
	  - tdict-slang.txt is added according to libthai.

2018-01-04  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Update my e-mail address in conv.

	* conv/*.h, conv/*.cxx:
	  - Replace my nectec e-mail address with the gmail one.

2018-01-04  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Remove unnecessary includes in conv.

	* conv/conv.cxx, conv/convfact.h, conv/tis620.h, conv/utf8.h:
	  - Remove stdio.h include artifacts.
	* conv/conv.cxx:
	  - Remove unnecessary "utf8.h" and "tis620.h" includes, as
	    they are all realizations of "convkit.h" abstractions.
	    "convfact.h" already provides all necessary abstractions.

2018-01-04  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Replace the last malloc() with new[] operator.

	* src/convutil.cpp (ConvPrint()):
	  - Replace malloc() call with C++ new[] operator.
	  - Remove now-unneeded stdlib.h include.
	* configure.ac:
	  - Remove unnecessary check for malloc.h.

2018-01-04  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Remove unnecessary check for unused strstr().

	* configure.ac:
	  - Remove check for strstr(), which is never used.

2018-01-04  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Remove the dead midatrie stuff.

	* configure.ac:
	  - Remove comment lines for using midatrie. It has been dead.
	  - Move the datrie check up to library checks phase.

2017-11-28  Theppitak Karoonboonyanan <theppitak@gmail.com>

	* NEWS:
	  === Version 0.6.0 ===

2017-11-27  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Revise manpage.

	* src/swath.1:
	  - Improve wording.
	  - Mention trietool(1) instead of trietool-0.2(1).

2017-11-26  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Remove LinkSep terminator.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Remove LinkSep[] terminator assignment, which was previously
	    required by ReverseLinkSep(), where LinkSep[] was processed
	    from start to end.

2017-11-24  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Merge loop in MaxWordSeg::saveSegment().

	* src/maxwordseg.cpp (MaxWordSeg::saveSegment()):
	  - Optimize array access using pointers.
	  - Merge the extra iteration on the last element into the main
	    loop by checking for the terminator, so that the loop also
	    covers the last element and it no longer needs to be copied
	    separately from the rest before it.

2017-11-23  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Only reverse word list on Thai chunks.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Add word list reversal when a Thai chunk is done.
	    Other chunks don't need it.
	* src/abswordseg.cpp (AbsWordSeg::WordSeg()):
	  - Remove ReverseLinkSep() call.
	* src/abswordseg.h, src/abswordseg.cpp
	  (-AbsWordSeg::ReverseLinkSep()):
	  - Drop the method, now unneeded.

2017-11-23  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Reorder WordStack member declarations.

	* src/wordstack.h (class WordStack):
	  - Move private section to the end.

2017-11-22  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Chain cases in AbsWordSeg::CreateWordList() together.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Chain chunk type cases into a single if-else-statement,
	    instead of several if-statements with 'continue' at the end,
	    to clarify the classification checks.

2017-11-21  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Make WordSeg() internal functions private.

	* src/abswordseg.h:
	  - Move CreateWordList(), ReverseLinkSep(), CreateSentence(),
	    GetBestSen() methods from protected to private.

2017-11-21  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Rename AbsWordSeg::SwapLinkSep() to ReverseLinkSep().

	* src/abswordseg.h, src/abswordseg.cpp:
	  - Rename AbsWordSeg::SwapLinkSep() to ReverseLinkSep().

2017-11-21  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Remove empty default c-tors.

	* src/longwordseg.h, src/longwordseg.cpp:
	* src/maxwordseg.h, src/maxwordseg.cpp:
	  - Remove empty default c-tors for LongWordSeg & MaxWordSeg.

2017-11-20  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Simplify loop in MaxWordSeg::WordSegArea().

	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea()):
	  - Rearrange loop for better readability.
	  - With this, foundUnk flag is eliminated.

2017-11-17  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Adjust MaxWordSeg::CreateSentence().

	* src/maxwordseg.cpp (MaxWordSeg::CreateSentence()):
	  - Assign stWord = i + 1 after unknown zone, to resemble
	    the case of in-dict word (where stWord is assigned 1, not 0).
	    Then, the later for-loop on stWord won't need to check and
	    increment stWord the first time, as we already know
	    LinkSep[IdxSep[stWord]] == enAmb, and the if-condition
	    within the for-loop will always fail in the first round.
	  - Drop unnecessary (stWord < textLen) check, as enAmb is assumed
	    to never exceed textLen, and thus (stWord < enAmb) should
	    already imply (stWord < textLen).
	  - Merge the (IdxSep[stWord] < 0) check into next if-condition.

2017-11-16  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Make LongWordSeg try other paths than just the first.

	* src/longwordseg.cpp (LongWordSeg::CreateSentence()):
	  - Wrap the text iteration loop with backtrack loop,
	    employing the BackTrackStack.
	  - Keep track of unknown word count.
	  - Return immediately if text end is reached with zero unknown.
	  - Otherwise, try to find the best sentence with least unknowns.

2017-11-15  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Simplify LongWordSeg::CreateSentence().

	* src/longwordseg.cpp (LongWordSeg::CreateSentence()):
	  - Use intermediate WordState instantiation and get rid of
	    wState var.
	  - Move common SepData[senIdx].Sep[sepIdx] assignments out of
	    the cases of 'if (IdxSep[Idx] >= 0)' block.
	  - Do not check Idx against textLen inside the while-loop,
	    which already checks the same condition.
	  - Move common SepData[senIdx].Sep[sepIdx] terminator assignments
	    out of the cases of 'if (SepData[senIdx].Sep[sepIdx - 1]
	    == textLen)' block.

2017-11-15  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Capitalize wordState type.

	* src/worddef.h:
	* src/wordstack.h:
	* src/longwordseg.cpp:
	* src/maxwordseg.cpp:
	  - Rename wordState type to WordState.

2017-11-15  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Add test in which long and max results differ.

	* tests/test-simple.sh:
	  - Add test case in which longest matching and maximal matching
	    schemes yield different results.
	  - Also print input string on each test step for debugging.

2017-11-11  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Add test for longest matching scheme to all tests.

	* tests/test-simple.sh:
	* tests/test-long.sh:
	* tests/test-mixed.sh:
	* tests/test-utf8-wbr.sh:
	* tests/test-latex.sh:
	  - Repeat all tests one more time, with different -m option.
	    (This seems to catch an error on longest matching.)

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Remove excessive LinkSep terminator.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Remove extra LinkSep[] terminator assignment.
	    Only one is enough.

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Adjust SwapLinkSep() loop more.

	* src/abswordseg.cpp (AbsWordSeg::SwapLinkSep()):
	  - Rename variables: st_idx -> start; en_idx -> end.
	  - Instead of saving 'end' and directly iterating on it,
	    define another 'upper' iterator for that purpose.
	    Then, 'end' value can be used at the end.

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Simplify SwapLinkSep() loop.

	* src/abswordseg.cpp (AbsWordSeg::SwapLinkSep()):
	  - Move en_idx, end_point, tmp vars inside loop
	    where they are used. Only st_idx is left on top level.
	  - No more en_idx assignment at loop end.
	  - Stop en_idx exactly at LinkSep[] element with value -1,
	    not the one after.
	  - Simply assign end_point as saved storage of en_idx value.
	    No computation is needed.
	  - Simply decrement en_idx to get the last non-terminator element.
	    No confusing skip by 2.
	  - Simplify the swap code inside inner while-loop.

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Set IdxSep[] on every char instead of relying on defaults.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Instead of skipping IdxSep[i] assignments and relying on
	    initial values on some chars, let's assign them explicitly.

	* src/abswordseg.cpp (AbsWordSeg::WordSeg()):
	  - Remove now-unneeded InitData() call.
	* src/abswordseg.h, src/abswordseg.cpp (-AbsWordSeg::InitData()):
	  - Remove now-unneeded InitData() method.

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Add test for mixed text.

	* tests/Makefile.am,
	  +tests/test-mixed.sh, +tests/mixed.txt, +tests/mixed-wseg.txt:
	  - Add test case for text mixed with English text, digits,
	    Thai digits.
	* tests/Makefile.am:
	  - Clear test outputs on 'clean' instead of 'distclean'.

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Replace for-loop with while-loop.

	text[i] iteration has irregular increments, as decrements
	before continue appear in many places. This indicates
	improper use of for-loop.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Replace for-loop with while-loop, and increment the iterator
	    as actually required.
	  - Remove unnecessary decrements.
	  - Rewrite comments and regroup source lines.

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Adjust IdxSep[] and LinkSep[] settings in Thai chunk.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Replace cntFound counter with bool isWordFound.
	  - Initialize IdxSep[i] = -1 and overwrite it later,
	    instead of setting it afterward when no word is found.
	  - Always add terminating element to LinkSep[] in case
	    word is found, regardless of whether the word count is
	    below 2000. (Weird limit value.)
	  - Remove now-unnecessary LinkSep[cntLink] = -1 assignment
	    on pseudo-ending element.

2017-11-10  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Simplify word list preparation loop within Thai chunk.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList()):
	  - Iterate j from i to textLen instead of from 0 to (textLen - i)
	    and then simply access text[j] instead of text[i + j].
	  - Rearrange IdxSep[i] assignment and cntFound increment,
	    for better readability.
	  - Adjust LinkSep[cntLink] assignment logic to resemble that in
	    other code.

2017-11-09  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Refactor dictionary out of AbsWordSeg.

	* src/abswordseg.h, src/abswordseg.cpp,
	  src/Makefile.am, +src/dict.h, +src/dict.cpp, -src/dictpath.h:
	  - New class Dict to wrap up Trie, with Dict::State to wrap up
	    TrieState.
	  - Move AbsWordSeg::InitDict() functionality into Dict::open().
	  - Remove AbsWordSeg::MyDict member.
	* src/abswordseg.h, src/abswordseg.cpp:
	  - Add dict arg to AbsWordSeg::WordSeg() and
	    AbsWordSeg::CreateWordList(), and use it instead of
	    the old MyDict.
	  - Replace TrieState type with Dict::State.

	* src/wordseg.cpp:
	  - [InitWordSegmentation(), +OpenDict()] Split dictionary opening
	    into a separate function, and simplify the rest of the code.
	  - [main()] Define Dict local instance and call OpenDict() upon it.
	    Drop dictpath parameter from InitWordSegmentation() call.
	  - [WordSegmentation()] Add dict arg and pass it on to
	    AbsWordSeg::WordSeg().
	  - [main()] Add dict parameter to WordSegmentation() calls,
	    as now required.

2017-11-07  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Replace type for Unicode char with 'wchar_t' where it isn't.

	* src/worddef.h (isThaiUni, tis2uni, uni2tis):
	  - Replace 'unsigned int' type for Unicode char with 'wchar_t'.

2017-11-07  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Update test check file according to new dict.

	* tests/long-wseg.txt:
	  - 'เท่าไหร่' is now a single word, not 'เท่า|ไหร่'.

2017-11-07  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Rename AbsWordSeg::Has_Karun() to HasKaran().

	* src/abswordseg.h, src/abswordseg.cpp:
	  - Rename AbsWordSeg::Has_Karun() to HasKaran().

2017-11-06  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Replace mysterious 'mode' var with 'verbose'.

	* src/wordseg.cpp (main):
	  - Replace "char mode" variable with "bool verbose".
	  - Replace all assignments & checks of the var based on bool.

2017-11-06  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop commented code.

	* src/wordseg.cpp:
	  - Drop commented code for old WORDSEGDATA_DIR.

2017-11-06  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop undocumented option '-l'.

	* src/wordseg.cpp (Usage):
	  - Drop commented code mentioning '-l' option.
	* src/wordseg.cpp (main):
	  - Drop check for '-l' command line arg.
	  - Eliminate code for 'wholeLine == true' case.
	  - Drop now-unused 'wholeLine' var.

2017-11-06  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop remaining comments mentioning shaping support.

	* src/wordseg.cpp (Usage):
	  - Drop commented code mentioning 'winlatex' and 'maclatex'
	    options.

2017-11-06  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Drop unnecessary includes in C sources.

	* src/abswordseg.cpp:
	  - Drop "worddef.h", already included in "abswordseg.h"
	* src/wordseg.cpp:
	  - Drop unused <time.h>

2017-11-02  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Update dictionary from thailatex GitHub.

	* data/tdict-city.txt:
	* data/tdict-common.txt:
	* data/tdict-district.txt:
	* data/tdict-geo.txt:
	* data/tdict-history.txt:
	* data/tdict-ict.txt:
	* data/tdict-lang-ethnic.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* data/tdict-spell.txt:
	  - Update dictionary from dehyphenated source for thailatex
	    hyphenation patterns (based on libthai 0.1.27).

2016-12-25  Theppitak Karoonboonyanan <theppitak@gmail.com>

	* NEWS:
	  === Version 0.5.5 ===

2016-12-15  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Update dictionary from thailatex GitHub.

	* data/tdict-city.txt:
	* data/tdict-common.txt:
	* data/tdict-geo.txt:
	* data/tdict-ict.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* data/tdict-spell.txt:
	* data/tdict-std-compound.txt:
	* data/tdict-std.txt:
	  - Update dictionary from dehyphenated source for thailatex
	    hyphenation patterns.

2016-12-15  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Include git-version-gen in tarball

	* Makefile.am:
	  - Add build-aux/git-version-gen to EXTRA_DIST.

2016-07-11  Theppitak Karoonboonyanan <theppitak@gmail.com>

	Use versioning based on Git snapshot.

	* Makefile.am:
	  - Add dist-hook to generate VERSION file on tarball generation.
	* +build-aux/git-version-gen:
	  - Add script to generate version based on 'git describe'
	    if in git tree, or using VERSION file if in release tarball.
	* configure.ac:
	  - Call git-version-gen to get package version.

2016-07-08  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	* configure.ac, NEWS:
	=== Version 0.5.4 ===

2016-07-08  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Clean intermediate file on LaTeX test.

	* tests/Makefile.am:
	  - Add thai-latex-out.tex to DISTCLEANFILES.

2016-07-08  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Add test case for LaTeX.

	* tests/Makefile.am, +tests/test-latex.sh,
	  +tests/thai-latex.tex, +tests/thai-latex-wseg.tex:
	  - Add Thai LaTeX sample for testing.

2016-07-08  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Remove debug option from test.

	* tests/test-utf8-wbr.sh:
	  - Remove '-x' sh option.

2016-07-08  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Fix exit status on error.

	* tests/test-simple.sh:
	  - On error, exit with code 1, as -1 is illegal for sh.

2016-07-06  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Add test case for UTF-8 word break code.

	* tests/Makefile.am, +tests/test-utf8-wbr.sh:
	  - Add test case for non-ASCII word break string,
	    for Issue #1.

2016-07-06  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Add test case for long line.

	* tests/Makefile.am, +tests/test-long.sh,
	  +tests/long.txt, +tests/long-wseg.txt:
	  - Add test script and sample input with long line.

2016-07-06  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Add test suite, with simple test case.

	* configure.ac, Makefile.am:
	  - Add tests/ subdir.

	* +tests/Makefile.am, +tests/test-simple.sh:
	  - Add a simple test case.

2016-07-03  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Fix mem leak on file format error.

	* src/wordseg.cpp (main):
	  - Exit word segmentation before exiting program.

2016-07-02  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Check memory failure in InitWordSegmentation().

	* src/wordseg.cpp (InitWordSegmentation):
	  - Exit on memory allocation failure.

2016-07-02  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Simplify InitWordSegmentation().

	* src/wordseg.cpp (InitWordSegmentation):
	  - Rewrite to return (AbsWordSeg*), with NULL indicating failure.
	  - Simplify method determination logic, with fallback.
	  - Simplify dictionary trial logic.

	* src/wordseg.cpp (main):
	  - Call InitWordSegmentation() with new convention.

2016-07-01  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Support longer line length.

	* src/worddef.h (MAXLEN, MAXSEP):
	  - Increase max line length and max words by 3 times,
	    to support longer paragraphs. We had better address this
	    with resizable buffers, but the change would be too deep
	    for now.

	* src/worddef.h (-MAXCHOICE, -MAXTAG):
	  - Drop unused macros.

	Thanks Santi Romeyen for the report via a personal e-mail.

2016-07-01  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Update dictionary from thailatex GitHub.

	* data/tdict-city.txt:
	* data/tdict-common.txt:
	* data/tdict-country.txt:
	* data/tdict-district.txt:
	* data/tdict-ict.txt:
	* data/tdict-lang-ethnic.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* data/tdict-spell.txt:
	  - Update dictionary from dehyphenated source for thailatex
	    hyphenation patterns.

2016-06-21  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Interpret word break string using output encoding.

	As an adjustment to the issue #1 fix, taking the word break string
	in output encoding should be more sensible to users than in
	input encoding, so they can directly specify the string to insert.

	* src/wordseg.cpp (main):
	  - Convert wbr depending on isUniOut instead of isUniIn.

2016-06-20  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Fix issue #1 - Support non-ASCII word break string.

	Word break string should be read & converted the same way
	input text is.

	* src/convutil.h, src/convutil.cpp (+ConvStrDup):
	  - Add ConvStrDup() utility for converting string in buffers.

	* src/filterx.h (wordBreakStr, GetWordBreak(), FilterX()):
	  - Change type of word break string from const char*
	    to const wchar_t*.
	* src/filterhtml.cpp:
	* src/filterlatex.h:
	* src/filterlatex.cpp:
	* src/filterlambda.h:
	* src/filterrtf.cpp:
	  - Change argument types accordingly.

	* src/wordseg.cpp (WordSegmentation):
	  - Change type of wbr from const char* to const wchar_t*.
	  - Use wchar functions accordingly.

	* src/wordseg.cpp (main):
	  - Prepare wchar_t version of word break string as needed.
	  - Call WordSegmentation() with wchar_t version
	    of word break strings.
	  - Print trailing word break string explicitly in mule mode.

	Thanks GitHub @pepa65 for the report.

2016-05-06  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Fix segfault on extremely long input.

	* src/convutil.cpp (ConvGetS):
	  - Check for output buffer size when writing.

2015-05-07  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Update dictionary from texhyphen SVN.

	* data/tdict-city.txt:
	* data/tdict-collection.txt:
	* data/tdict-common.txt:
	* data/tdict-country.txt:
	* data/tdict-ict.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* data/tdict-spell.txt:
	  - Update dictionary from dehyphenated source for texhyphen
	    hyphenation patterns.

2015-03-07  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Also check for 'trietool' (without -0.2 suffix)

	* configure.ac:
	  - Check for both trietool-0.2 (for libdatrie < 0.2.9) and
	    trietool (for libdatrie >= 0.2.9) utility.

2015-03-07  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	* configure.ac: Post-release version suffix added.

2014-09-01  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	* configure.ac, NEWS:
	=== Version 0.5.3 ===

2014-08-31  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Drop more unused code.

	* src/filterlatex.cpp:
	  - Drop unused vars & functions previously used by the removed
	    function FilterLatex::AdjustText().

2014-08-31  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Get rid of C-style typedef for structs.

	* src/worddef.h:
	  - Just use C++ typing for structs.

2014-08-31  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Get rid of C-style typedef for enum.

	* src/wordseg.cpp (enum TextToken):
	  - Just use C++ typing.

2014-08-30  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Drop unneeded include.

	* src/wordseg.cpp:
	  - Drop unneeded "dictpath.h" include.

2014-08-30  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Drop unneeded include.

	* src/maxwordseg.cpp:
	  - Drop unneeded standard C includes.

2014-08-30  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Drop unneeded include.

	* src/longwordseg.cpp:
	  - Drop unneeded "dictpath.h" & standard C includes.

2014-08-30  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Drop unused function & member.

	* src/filterlatex.h, src/filterlatex.cpp (FilterLatex::AdjustText):
	  - Drop unused member function.
	* src/filterlatex.h, src/filterlatex.cpp (FilterLatex::FilterLatex):
	  - Drop now-unused member 'winCharSet'.
	  - Drop c-tor parameter 'latexflag' used for winCharSet
	    initialization.
	* src/filterlambda.h (FilterLambda::FilterLambda):
	  - Drop c-tor parameter 'latexflag'.

2014-08-30  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Drop unused member.

	* src/filterlatex.h, src/filterlatex.cpp (class FilterLatex):
	  - Drop unused member 'latexFlag'.

2014-08-30  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	* README: Update thailatex info to babel-thai.

2014-08-30  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Update copyright info.

	* src/wordseg.cpp (Version):
	  - Update copyright year.
	  - Update my e-mail address.

2014-08-29  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Update man page.

	* src/swath.1:
	  - Revise wording.
	  - Use two spaces to delimit sentences.
	  - Update thailatex info to babel-thai.
	  - Update my e-mail address.

2014-08-29  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	* README: Update my e-mail address.

2014-08-28  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Fix excessive break positions in plain text mode.

	* src/wordseg.cpp (SplitToken):
	  - Distinguish Thai tokens from non-Thai.
	  - Make return values of SplitToken() more readable with enum.
	* src/wordseg.cpp (main):
	  - In plain text mode, only feed WordSegmentation() calls with
	    Thai-only tokens, to prevent extra break positions in non-Thai
	    zones. This imitates the behavior of formatted text modes.

	Thanks Sorawee Porncharoenwase for the report via Google+.

2014-08-24  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Fix convutil buffer preparation for UTF-8.

	* src/convutil.cpp (ConvGetS, ConvPrint):
	  - Prepare 5 bytes per char for UTF-8 buffer.

	Thanks Sorawee Porncharoenwase for the report via Google+.

2014-08-22  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	Update dictionary from texhyphen SVN.

	* data/tdict-city.txt:
	* data/tdict-common.txt:
	* data/tdict-district.txt:
	* data/tdict-geo.txt:
	* data/tdict-ict.txt:
	* data/tdict-lang-ethnic.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* data/tdict-spell.txt:
	* data/tdict-std-compound.txt:
	* data/tdict-std.txt:
	  - Update dictionary from dehyphenated source for texhyphen
	    hyphenation patterns.

2014-08-22  Theppitak Karoonboonyanan  <theppitak@gmail.com>

	* configure.ac: Post-release version suffix added.

2013-12-23  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.ac, NEWS:
	=== Version 0.5.2 ===

2013-12-21  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* AUTHORS: Set me as Current maintainer.

2013-12-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix off-by-one character loss in long HTML token.

	* src/filterhtml.cpp (FilterHtml::GetNextToken):
	  - On the verge of a full buffer, jump to the 'chbuff' storing stage,
	    rather than just quitting the loop, so the first character
	    of the next part of the long token is not lost.
	  - Also decrement 'tokenSz' on first character. Missing this
	    could cause buffer overflow.

	Thanks Nicolas Brouard for the report.

2013-12-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix another infinite loop on EOF after punct or Thai chunk.

	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Update buffer after comsumeToken(), so next reads won't get the
	    same contents again.

2013-12-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Adjust source.

	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Unindent else-block after the dead if-block.

2013-12-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix infinite loop on verbatim input.

	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Update buffer after comsumeToken(), so next reads won't get the
	    same contents again.

	Thanks Neutron Soutmun for the report and patch.

2013-12-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.ac: Post-release version suffix added.

2013-10-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.ac, NEWS:
	=== Version 0.5.1 ===

2013-10-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Adjust FilterRTF::GetNextToken() to prevent overflow.

	* src/convutil.h, src/convutil.cpp (+Ascii2WcsNCopy, +Ascii2WcsNCat):
	  - Add utility functions for copying ASCII to wide char string
	    within limited count.
	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - Check for token size before writing to prevent buffer overflow.

	TODO: Use C++ wstring and string instead?

2013-10-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Adjust FilterHtml::GetNextToken() to prevent overflow.

	* src/filterhtml.cpp (FilterHtml::GetNextToken):
	  - Check for token size before writing to prevent buffer overflow.

2013-10-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Adjust FilterX::GetNextToken() for more safety.

	* src/filterx.h, src/filterhtml.h, src/filterlatex.h, src/filterrtf.h,
	  src/filterhtml.cpp, src/filterlatex.cpp, src/filterrtf.cpp,
	  (FilterX::GetNextToken):
	  - Add 'tokenSz' arg so that buffer overflow can be prevented.
	* src/wordseg.cpp (main):
	  - Adjust call to FilterX::GetNextToken() accordingly.

	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Implement the adjusted interface to make sure 'token' is never
	    overflowed when writing data.
	* src/Makefile.am, +src/utils.h:
	  - Add "utils.h" for utility functions (first used in
	    FilterLatex::GetNextToken()).
	* src/convutil.h, src/utils.h, src/wordseg.cpp (N_ELM):
	  - Move N_ELM() macro from convutil.h to utils.h.

2013-10-23  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Update dictionary from thailatex SVN.

	* data/tdict-common.txt:
	* data/tdict-district.txt:
	* data/tdict-geo.txt:
	* data/tdict-history.txt:
	* data/tdict-ict.txt:
	* data/tdict-lang-ethnic.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* data/tdict-spell.txt:
	* data/tdict-std-compound.txt:
	* data/tdict-std.txt:
	  - Update dictionary from dehyphenated source for thailatex
	    hyphenation patterns.

2013-10-23  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix automake warnings.

	* src/Makefile.am:
	  - Replace deprecated INCLUDES with AM_CPPFLAGS.

2013-05-13  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Lambda's word break should be ZWSP (U+200B).

	* src/filterlambda.h (FilterLambda::FilterLambda):
	  - Change wbr default arg from '^^^^200c' to '^^^^200b'.
	* src/wordseg.cpp (Usage):
	  - Fix text explaining Lambda's word break.

2013-05-13  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.ac: Post-release version suffix added.

2013-02-11  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.ac, NEWS:
	=== Version 0.5.0 ===

2013-02-01  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix non-srcdir build failure.

	* data/Makefile.am:
	  - Read swathdic.lst from build dir, not source dir, as it's now
	    a generated file.

2013-01-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Modernize autoconf & switch to XZ tarball compression.

	* configure.in -> configure.ac:
	  - Rename file for modern autoconf.
	  - Require autoconf 2.50.
	  - Replace deprecated AC_INIT() form with PACKAGE, VERSION form.
	  - Also add BUG-REPORT address.
	  - Use AC_CONFIG_SRCDIR() in place of the old AC_INIT() form.
	  - Replace deprecated AM_INIT_AUTOMAKE() form with OPTIONS form,
	    passing "dist-xz no-dist-gzip" options to switch to XZ tarball.
	  - Drop unneeded AC_SUBST(VERSION).

2013-01-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Update dictionary from thailatex SVN.

	* data/tdict-city.txt:
	* data/tdict-common.txt:
	* data/tdict-district.txt:
	* data/tdict-geo.txt:
	* data/tdict-ict.txt:
	* data/tdict-lang-ethnic.txt:
	* data/tdict-proper.txt:
	* data/tdict-science.txt:
	* data/tdict-spell.txt:
	* data/tdict-std-compound.txt:
	  - Update dictionary from dehyphenated source for thailatex
	    hyphenation patterns.

2013-01-23  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean-ups.

	* src/longwordseg.cpp (LongWordSeg::CreateSentence):
	  - Drop unused vars 'nextSepIdx' and 'curState'.
	* src/wordstack.h (WordStack::Top, WordStack::Empty):
	  - Make read-only methods const.

	Thanks Neutron Soutmun for the cppcheck.

2013-01-23  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean-ups.

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Get rid of unused var 'amb_sep_cnt'.
	* src/abswordseg.cpp (AbsWordSeg::GetBestSen):
	  - Reorder condition so the index 't' is range-checked before use.

	Thanks Neutron Soutmun for the cppcheck.

2013-01-15  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix potential buffer overflow in Mule mode.

	* src/wordseg.cpp (main):
	  - Make stopstr a simple pointer to wbr or "".
	    strcpy() to fixed array can cause buffer overflow vulnerability.

	Thanks Dominik Maier for the report.
	http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=698189

2013-01-15  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix non-Mule mode by not inserting wbr at the end of line.

	* src/wordseg.cpp (main):
	  - Print stopstr instead of wbr in !wholeLine case.

2012-06-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	More safety on checking RTFToken emptiness.

	* src/filterrtf.cpp (class RTFToken):
	  - Add isEmpty() method for checking emptiness.
	  - Also initialize val[] string on reset(), as some code did check
	    its emptiness by peeking this string.
	* src/filterrtf.cpp (FilterRTF::chgCharState):
	  - Check rtfToken emptiness with the new isEmpty() method.

2012-06-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix buffer overflow bug caused by some RTF docs.

	* src/filterrtf.h (class FilterRTF):
	  - Allocate more space for strbuff[] member. 5 was too few when
	    facing non-Thai characters like '\u65533' (plus its skip bytes).
	    The safe value is 37, for U+7FFFFFFF.

2012-06-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - Remove misleading comment.
	  - Make if-condition more concise.

2012-06-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix memleak.

	* src/convutil.cpp (ConvPrint):
	  - Delete allocated TextWriter object before exit.

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/worddef.h (isPunc):
	  - Drop unused and meaningless function.

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Handle CR (80..A0) non-Thai chars in RTF

	* src/worddef.h (isThai, tis2uni):
	  - Check for Thai more accurately (c > 0xa0, not just c & 0x80).
	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	  - wbr in unicode mode should be escaped, now that UTF-8 interception
	    technique has gone.
	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - Distinguish non-Thai in ANSI data bytes.
	* src/filterrtf.cpp (FilterRTF::Print):
	  - Also escape 80..A0 non-Thai chars.
	* src/filterrtf.cpp (PrintEscapedUTF8):
	  - Refactor common code.
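
	Illustrative sketch only: the stricter Thai check described above,
	together with the linear TIS-620 to Unicode mapping (0xA1..0xFB onto
	U+0E01..U+0E5B). The function names and the pass-through handling of
	non-Thai bytes are assumptions for this note, not the actual worddef.h
	code.

```cpp
#include <cassert>

// Hypothetical stand-in for isThai(): bytes 0x80..0xA0 have the high bit set
// but are not Thai in TIS-620, so testing (c & 0x80) alone is too loose.
inline bool IsThaiTisSketch(unsigned char c)
{
    return c > 0xA0;
}

// Hypothetical stand-in for tis2uni(): Thai bytes are shifted into the
// U+0E00 block; everything else is passed through unchanged (an assumption
// made for this sketch).
inline wchar_t Tis2UniSketch(unsigned char c)
{
    return IsThaiTisSketch(c) ? static_cast<wchar_t>(c - 0xA0 + 0x0E00)
                              : static_cast<wchar_t>(c);
}
```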

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/convutil.h, src/convutil.cpp (-ConvCopy):
	  - Remove unused function.

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Use UCS as internal storage for third language character support.

	* src/convutil.h, src/convutil.cpp:
	  - Rewrite ConvGetS(), ConvGetC() to return wchar_t (string).
	  - Add ConvPrint() for printing wchar_t string as given encoding
	    scheme (UTF-8/TIS-620).
	  - Add Ascii2WcsCopy() and Ascii2WcsCat() for transferring between
	    ASCII char/wchar_t arrays.

	* src/wordseg.cpp (main):
	  - Drop UTF-8 -> TIS-620 conversion as preprocessing and replace it
	    with ConvGetS(), which gives wchar_t string.
	  - Replace char arrays with wchar_t ones, along with all manipulations.
	* src/wordseg.cpp (WordSegmentation):
	  - Change args from char* to wchar_t*.
	  - No longer requires isUniOut arg, as it now only works on internal
	    UCS storage. The conversion is moved back to higher layer again.
	* src/wordseg.cpp (SplitToken):
	  - Change args from char* to wchar_t*.

	* src/abswordseg.h (class AbsWordSeg):
	  - Redeclare char -> wchar_t:
	    - WordSeg() arg
	    - 'text' member
	    - IsLeadChar(), IsLastChar(), Has_Karun() args
	* src/abswordseg.cpp (CreateWordList, IsLeadChar, IsLastChar,
	  Has_Karun, WordSeg):
	  - Replace char -> wchar_t, along with manipulations.
	* src/abswordseg.cpp (CreateWordList):
	  - Instead of checking for English with isalpha(), check for
	    !isThaiUni(), to cover third language characters.
	* src/abswordseg.cpp (Has_Karun):
	  - Don't bother calculating the THANTHAKHAT position, as it's never
	    used.
	* src/worddef.h (isThaiUniDigit):
	  - Add helper function for checking Thai digits.

	* src/filterx.h (class FilterX):
	* src/filterlatex.h, src/filterhtml.h, src/filterrtf.h:
	  - Redeclare abstract GetNextToken() and Print() args to wchar_t*.

	* src/filterlatex.h (class FilterLatex):
	  - Redeclare 'buffer' member as wchar_t array.
	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Replace char -> wchar_t, along with manipulations.
	* src/filterlatex.cpp (FilterLatex::Print):
	  - Ignore latexFlag case which calls AdjustText(). The shaping is
	    outdated and not worth working on.

	* src/filterhtml.h (class FilterHtml):
	  - Redeclare 'chbuff' member as wchar_t.
	* src/filterhtml.cpp (FilterHtml::GetNextToken, FilterHtml::Print):
	  - Replace char -> wchar_t, along with manipulations.

	* src/filterrtf.h (class FilterRTF):
	  - Redeclare char -> wchar_t:
	    - printNonThai() method arg
	    - strbuff[] member
	* src/filterrtf.cpp (FilterRTF::GetNextToken, FilterRTF::Print):
	  - Replace char -> wchar_t, along with manipulations.
	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - Add missing null-terminations on token.
	* src/filterrtf.cpp (FilterRTF::Print):
	  - Change logic from UTF-8 interception to direct UCS manipulation.
	    This also requires pre-calculation of UTF-8 bytes.
	    Helper function added in conv/utf8.{h,cxx}.
	  - Write UTF-8 skip bytes with the aid of UTF8Writer.
	* conv/utf8.h, conv/utf8.cxx (UTF8Bytes):
	  - Add utility function for pre-calculating UTF-8 bytes used by a UCS.
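
	Illustrative sketch only: one way to pre-calculate how many UTF-8
	bytes a UCS code point occupies, using the original 1 to 6 byte
	UTF-8 ranges (up to U+7FFFFFFF). The name Utf8ByteCount() and its
	signature are assumptions for this note, not the actual UTF8Bytes()
	declaration.

```cpp
#include <cassert>

// Hypothetical stand-in for UTF8Bytes(): number of bytes needed to encode
// code point uc in the original (up to 6-byte) UTF-8 scheme.
int Utf8ByteCount(unsigned long uc)
{
    assert(uc <= 0x7FFFFFFFUL);        // outside the encodable range otherwise
    if (uc < 0x80UL)      return 1;    // 7 bits
    if (uc < 0x800UL)     return 2;    // 11 bits
    if (uc < 0x10000UL)   return 3;    // 16 bits
    if (uc < 0x200000UL)  return 4;    // 21 bits
    if (uc < 0x4000000UL) return 5;    // 26 bits
    return 6;                          // 31 bits
}
```

	Under these assumptions a Thai character such as U+0E01 needs 3 bytes,
	which is the count an RTF writer would announce as skip bytes.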

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/filterhtml.cpp (FilterHtml::GetNextToken):
	  - Check for EOF on every getc, instead of occasionally checking
	    feof().
	  - Rewrite token collecting loop to be more concise.

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/filterx.h, src/filterhtml.h, src/filterhtml.cpp:
	  - Move FilterX::chbuff member to FilterHtml, as it's implementation
	    specific and was only used here. (FilterLatex uses different kind
	    of buffer, for example.)

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/filterhtml.cpp (FilterHtml::GetNextToken):
	  - Drop unnecessary var 'sttoken'.

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/wordseg.cpp (main):
	  - Drop unused vars 'leadch', 'folch'.
	  - Move 'line', 'output' declarations to where they're used.

2012-06-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Rename AbsWordSeg members 'sen', 'len' to sensible names.

	* src/abswordseg.h (class AbsWordSeg):
	* src/abswordseg.cpp:
	* src/longwordseg.cpp:
	* src/maxwordseg.cpp:
	  - Rename AbsWordSeg::{sen,len} to 'text' and 'textLen' respectively.

2012-06-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/abswordseg.cpp (AbsWordSeg::InitData):
	  - Make variable definition more concise.
	  - Drop unnecessary loop condition.

2012-06-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/abswordseg.h, src/abswordseg.cpp (class AbsWordSeg):
	  - Make LinkSep, IdxSep members array instead of pointer to allocated
	    memory.
	  - Drop unused methods IsNumber(), IsEnglish().
	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Remove unused vars Buff, data_idx.
	  - Move variable definitions to where they're used.

2012-06-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Print "Output:" title before blank line.

	* src/wordseg.cpp (main):
	  - [plain text] Previous commit introduced title-less output
	    for blank line. Reorder the steps to fix it.

2012-06-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/wordseg.cpp (main):
	  - Replace fgetc() loop with fgets().
	  - [plain text] Also print a blank line in response to blank input.

2012-06-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/wordseg.cpp (main):
	  - Move some vars into code block where they're used.
	  - Fix var name spelling 'thaifag' -> 'thaiFlag'.

2012-06-27  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/wordseg.cpp (SplitToken):
	  - Drop unused var 'i'.
	  - Rearrange loop condition and make assignments concise.

2012-06-27  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Source clean up.

	* src/wordseg.cpp:
	  - Drop myisspace() and just use libc isspace().
	  - Drop unused functions RemoveJunkChars() and IsValidChar().

2012-06-27  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Make RTFToken methods inline.

	* src/filterrtf.cpp (RTFToken):
	  - Make methods inline functions.

2012-06-27  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Rearrange functions to bring related functions closer.

	* src/filterrtf.cpp:
	  - Move chgCharState() to near GetNextToken().

2012-06-26  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Rewrite RTF reader for Unicode support.

	* src/worddef.h (isThaiUni, uni2tis):
	  - Add utility functions for UCS checking & conversion.
	* src/filterrtf.h, src/filterrtf.cpp:
	  - Make chgCharState() member return token as object, rather than
	    letting the caller do all the work.
	  - Add class RTFToken with its manipulation methods.
	  - Rewrite FilterRTF::GetNextToken() based on the extended state
	    machine.
	  - Adjust FilterRTF::Print() behavior, as the new reader now
	    tokenizes more accurately. Some common code has been split into
	    the new printNonThai() method.
	  - Drop the FilterRTF::isThaiChar() method. Now unneeded.

2012-06-25  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix RTF Unicode output with proper \ucN.

	* src/filterrtf.h (class FilterRTF):
	  - Add curUTFBytes member to keep track of current UTF-8 bytes.
	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	  - Initialize curUTFBytes = 1.
	  - Initialize Unicode word break with normal UTF-8, to be escaped
	    like ordinary characters.
	* src/filterrtf.cpp (FilterRTF::Print):
	  - Fix UTF-8 escaping loop so it outputs correctly.
	  - Re-initialize curUTFBytes to 1 when entering and leaving scope
	    delimiters '{' and '}'.
	  - Add \ucN before a UTF-8 character and before leaving '}' scope
	    when UTF-8 byte count changes from previous value.

2012-06-25  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix lost input chars near EOF.

	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - When EOF is found, check for token length before returning true
	    or false, not just blindly returning false.
	  - Drop an unnecessary feof() check.

2012-06-25  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Only escape Thai characters, not ASCII.

	* src/filterrtf.cpp (FilterRTF::Print):
	  - Only escape characters with MSB = 1, i.e. non-ASCII.

2012-06-25  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Use \u8203\'3f as word break for TIS-620 RTF

	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	  - Pass "\u8203\'3f" as word break for TIS-620 output.

2012-06-25  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Escape Unicode characters in UTF text.

	* conv/convkit.h (TextReader):
	  - Add curPos() public method for accessing current reading pos.
	* src/filterrtf.cpp (FilterRTF::Print):
	  - If output is Unicode, escape each output character in the form
	    "\uDDDD\'XX\'YY\'ZZ" (DDDD = decimal Unicode code point;
	    \xXX\xYY\xZZ = UTF-8 bytes), according to Widhaya Trisarnwadhana.
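
	Illustrative sketch only: building the escaped form described above
	for one character. It is limited to code points below U+10000, the
	function name RtfEscapeUtf8() is hypothetical, and lowercase hex
	digits are an assumption; it is not the actual FilterRTF::Print()
	code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: return "\uDDDD" followed by one "\'xx" group per
// UTF-8 byte of the code point uc (sketch handles uc < 0x10000 only).
std::string RtfEscapeUtf8(unsigned uc)
{
    assert(uc < 0x10000u);

    // Encode uc as 1 to 3 UTF-8 bytes.
    unsigned char bytes[3];
    int n = 0;
    if (uc < 0x80u) {
        bytes[n++] = static_cast<unsigned char>(uc);
    } else if (uc < 0x800u) {
        bytes[n++] = static_cast<unsigned char>(0xC0u | (uc >> 6));
        bytes[n++] = static_cast<unsigned char>(0x80u | (uc & 0x3Fu));
    } else {
        bytes[n++] = static_cast<unsigned char>(0xE0u | (uc >> 12));
        bytes[n++] = static_cast<unsigned char>(0x80u | ((uc >> 6) & 0x3Fu));
        bytes[n++] = static_cast<unsigned char>(0x80u | (uc & 0x3Fu));
    }

    // Decimal code point first, then each byte as a hex escape.
    char buf[16];
    std::snprintf(buf, sizeof buf, "\\u%u", uc);
    std::string out(buf);
    for (int i = 0; i < n; ++i) {
        std::snprintf(buf, sizeof buf, "\\'%02x", static_cast<unsigned>(bytes[i]));
        out += buf;
    }
    return out;
}
```

	With these assumptions, U+0E01 (decimal 3585, UTF-8 bytes E0 B8 81)
	comes out as \u3585\'e0\'b8\'81.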

2012-06-22  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Escape RTF word break.

	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	  - Escape word break text, according to Widhaya Trisarnwadhana.

2012-06-22  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Revert input conversion on FilterRTF.

	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - Revert ConvGetC() replacements to fgetc(). RTF stores characters
	    in escaped form, not directly UTF-8.

2012-06-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Add input conversion for FilterRTF.

	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - Replace fgetc() calls with ConvGetC().

2012-06-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Add input conversion for FilterHtml.

	* src/convutil.h, src/convutil.cpp:
	  - Add ConvGetC() function resembling fgetc().
	* src/filterhtml.cpp (FilterHtml::GetNextToken):
	  - Replace fgetc() calls with ConvGetC().

2012-06-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Drop MAXCHAR macro and just use MAXLEN.

	* src/wordseg.cpp:
	  - Drop MAXCHAR macro def.
	  - s/MAXCHAR/MAXLEN/g.

2012-06-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Add input conversion for FilterLatex.

	* src/Makefile.am, +src/convutil.h, +src/convutil.cpp:
	  - Add conversion utility functions resembling libc functions.
	  - Move UCopy() from src/wordseg.cpp to ConvCopy() here.
	  - Add ConvGetS() function resembling fgets().
	* src/wordseg.cpp (WordSegmentation):
	  - Replace UCopy() calls with ConvCopy().
	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Replace fgets() calls with ConvGetS().

2012-05-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Only convert output text, not the wbr string.

	Let's move the output sentence creation into WordSegmentation().
	This means moving Unicode conversion from main() (upper layer),
	and wbr insertion from AbsWordSeg::WordSeg() (lower layer).

	* src/wordseg.cpp (WordSegmentation):
	  - Add isUniOut parameter.
	  - Also add outputSz to guard against buffer overflow.
	  - Assuming AbsWordSeg::WordSeg() returns sep points array,
	    alternately do text conversion and inserting wbr at the sep points.
	* src/wordseg.cpp (UCopy):
	  - Add the utility function for copying text with conditional
	    conversion.
	* src/wordseg.cpp (main):
	  - Add "isUniOut" and "sizeof output" args to all WordSegmentation()
	    calls.
	  - Replace all UPrint() calls with plain printf(), as no more
	    conversion is needed at this layer.
	  - Adjust loop condition a little bit.
	* src/wordseg.cpp (UPrint):
	  - Drop the now-unused utility function.
	* src/abswordseg.h, src/abswordseg.cpp (WordSeg):
	  - Redefine WordSeg() method to accept outSeps[] array, with size,
	    and return the total members filled.
	* src/abswordseg.h, src/abswordseg.cpp (GetBestSen):
	  - Simply copy the SepData[bestidx] to outSeps[] array, instead of
	    copying the original text and inserting wbr.
	* src/abswordseg.h, src/abswordseg.cpp (GetWord):
	  - Drop the now-unused method.
	* src/abswordseg.h (SwapLinkSep, GetBestSen, CreateWordList,
	  copySepData):
	  - Make methods non-virtual, as virtual is not needed.

2012-05-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Null-terminate conversion output string.

	* conv/conv.cxx (conv): Write '\0' to terminate output string.

2012-05-01  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Prepare internal APIs for on-the-fly conversion.

	* src/filefilter.h, src/filefilter.cpp (CreateFilter):
	  - Add isUniIn, isUniOut parameters.
	  - Pass the additional args to filter c-tors.
	* src/filterx.h (FilterX):
	  - Add isUniIn, isUniOut members and stores them via c-tor.
	* src/filterhtml.h, src/filterhtml.cpp (FilterHtml):
	* src/filterlatex.h, src/filterlatex.cpp (FilterLatex):
	* src/filterlambda.h (FilterLambda):
	* src/filterrtf.h, src/filterrtf.cpp (FilterRTF):
	  - Add isUniIn, isUniOut parameters to c-tors and pass them to
	    FilterX base c-tor.
	* src/wordseg.cpp (UPrint):
	  - Add the UPrint utility function which conditionally converts
	    encoding before printing.
	* src/wordseg.cpp (main):
	  - Drop old via-tmpfile conversion code.
	  - Replace tmpin, tmpout with stdin and stdout.
	  - Add bool isUniIn, isUniOut vars.
	  - Pass isUniIn, isUniOut to CreateFilter() for formatted files.
	  - For plain text:
	    - Conditionally convert input text before processing.
	    - Conditionally convert output text chunk by chunk using UPrint().
	    - Never convert wbr text.

2012-04-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	== Begin of spontaneous conversion branch ==
	Adjust convkit API for in-memory conversion.

	* conv/conv.h, conv/conv.cxx:
	 - Change conv() API to accept in-memory storage.
	* conv/Makefile.am, -conv/convkit.cxx:
	 - Move TransferText() code into conv() and remove the function.
	* conv/convkit.h:
	 - Make TextReader & TextWriter include internal text buffer
	   and pointer.
	 - Add getChar(), writeChar(), spaceLeft() methods.
	* conv/tis620.h, conv/tis620.cxx, conv/utf8.h, conv/utf8.cxx:
	 - Adjust concrete classes according to the newer base classes.
	* conv/convfact.h, conv/convfact.cxx:
	 - Adjust the factories according to the newer classes.

2012-06-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in: Post-release version suffix added.

2012-06-13  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.4.3 ===

2012-06-13  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* AUTHORS: Add info for word list source.

2012-06-12  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* data/Makefile.am, +data/tdict-*.txt:
	  - Add dictionary source from dehyphenated text of thailatex
	    hyphenation dictionary source.

	* data/Makefile.am, -data/swathdic.lst:
	  - Drop old dictionary source.
	  - Build the dictionary source by catenating dict files.

2012-06-12  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* data/Makefile.am, +data/update-dict.sh:
	  - Add script for updating dictionary from thailatex hyphenation
	    source.

2012-03-22  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* data/Makefile.am, data/swathdic.lst: Convert word list to UTF-8.

2012-03-22  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in: Post-release version suffix added.

2012-02-08  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.4.2 ===

2012-02-08  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterrtf.cpp (FilterRTF::GetNextToken):
	  - Fix GCC warnings on sscanf format strings.

2012-01-04  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea):
	  - The last "wState" use is also temporary, replaced with
	    intermediate object.

2012-01-04  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea):
	  - Split out the assignment in if-condition.
	  - Delay the "curState" declaration and merge the steps.

2012-01-04  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea):
	  - Declare vars where they are used.
	  - Change "Idx" var type from short int to int, to match function
	    args it keeps up with.
	  - First "wState" use is temporary, replaced with intermediate object.
	  - "tmps" var eliminated, replaced with immediate assignment.
	  - Merge "looptime" break into while-loop condition.

2012-01-04  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Change MaxWordSeg SepType::Score type from float to int.

	All calculations are already in int, so the casts are just
	unnecessary, which indicates that "float" was never intended.
	Change short int to int to be more machine-native.

	* src/worddef.h:
	  - Change SepType::Score type to int.
	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea):
	  - Change types of "bestScore", "tmps", "score[]" to int.
	  - Drop all (short int) type casts.

2012-01-04  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp (MaxWordSeg::CreateSentence):
	  - Declare vars where they are used.
	  - Drop unused var "cntArea".
	  - Rearrange source.

2012-01-03  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (FilterLatex::AdjustText):
	  - Add the missing case of "below vowel + tone mark".

2012-01-03  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (static data, idxVowelToneMark):
	  - Move static data definitions to near FilterLatex::AdjustText().
	  - Rearrange the data so that 3 sets of characters are handled
	    separately (top marks, above marks, below marks).
	  - Split idxVowelToneMark() into idxTopMark(), idxAboveMark() and
	    idxBelowMark(); make them return -1 on failure, for easy data
	    management.
	  - Replace all 8-bit codes with hex.
	* src/filterlatex.cpp (FilterLatex::AdjustText):
	  - Call the classifiers and use the data under new convention.
	  - Fix bug for the case of below vowels under Do Chada, To Patak.

2012-01-03  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (FilterLatex::AdjustText):
	  - Guard against buffer overflow on extra char (Sara-Aa).
	  - Change boundary checking condition, to improve the chance of
	    optimization.

2012-01-03  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (FilterLatex::AdjustText):
	  - Get rid of "chgchar" var by just writing output first and
	    overwriting it later.

2012-01-03  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.h, src/filterlatex.cpp (FilterLatex::AdjustText):
	  - Add "output_sz" arg and check to prevent buffer overflow.
	  - Also return total written bytes for caller to check.
	* src/filterlatex.cpp (FilterLatex::Print):
	  - Adjust call accordingly.

2012-01-03  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (FilterLatex::AdjustText):
	  - Drop unnecessary vars "tmpInput" and "tmpOutput".

2012-01-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Adjust source.

	* src/filterlatex.cpp (FilterLatex::Print):
	  - Use stack array instead of heap allocation for output buffer.
	  - Merge if-cases.

2012-01-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Adjust source.

	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Replace hard-coded buffer size 1999 with "sizeof buffer".
	  - Get rid of "stPtr" var, which was set only once to "buffer"
	    and never changed.
	  - Declare vars where they are used.
	  - Adjust brackets & conditions.

2012-01-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.h:
	  - Drop "findAmbArea()" method, which was never defined nor
	    referenced from anywhere.

2012-01-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Get rid of temporary member MaxWordSeg::score.

	Its only use was in WordSegArea(), so it can be local var.

	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea):
	  - Declare score as local array (no heap allocation).
	* src/maxwordseg.h, src/maxwordseg.cpp:
	  - Drop "score" new/delete.
	  - Drop now-empty d-tor.

2012-01-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Get rid of temporary member MaxWordSeg::sepIdx.

	Its only use was to pass value between CreateSentence() and
	saveSegment() methods. This should be done via function arg instead.

	The other occurrence of "sepIdx" in WordSegArea() is already local
	var, and thus irrelevant.

	* src/maxwordseg.h, src/maxwordseg.cpp (MaxWordSeg::saveSegment):
	  - Add "sepIdx" arg and return its updated value when finished.
	* src/maxwordseg.cpp (MaxWordSeg::CreateSentence):
	  - Declare "sepIdx" as local var.
	  - Update its value from saveSegment() return value.
	* src/maxwordseg.h (class MaxWordSeg):
	  - Drop "sepIdx" member.

2012-01-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.h, src/maxwordseg.cpp (MaxWordSeg::MaxWordSeg):
	  - Drop unused "noAmbArea" member.

2012-01-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.h:
	  - Rearrange member declarations.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h:
	  - Rearrange member declarations.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/dictpath.h:
	* src/wordstack.h:
	* src/worddef.h:
	* src/abswordseg.h:
	* src/longwordseg.h:
	* src/maxwordseg.h:
	  - Adjust repeated-inclusion-preventing preprocessor style for
	    consistency.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	  - Initialize members with initialization list where possible.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterrtf.h:
	  - More member declaration rearrangement.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterhtml.h, src/filterhtml.cpp:
	* src/filterrtf.h, src/filterrtf.cpp:
	  - Rearrange member declarations.
	  - Drop the empty d-tor.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.h, src/filterlatex.cpp:
	  - Drop the empty d-tor.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (FilterLatex::FilterLatex):
	  - Initialize members with initialization list where possible.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.h:
	  - Rearrange member declarations.
	  - Make AdjustText() private.
	  - Make winCharSet, latexFlag private.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.h, src/filterlatex.cpp:
	  - Make idxVowelToneMark() non-member, as it depends on file-internal
	    static vars.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.h, src/worddef.h, src/filterlatex.cpp:
	  - Move isLongTailChar() to global worddef.h. Rename it to
	    isThaiLongTailChar().
	  - Adjust all calls in filterlatex.cpp accordingly.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Make FilterX::wordBreakStr private and initialized by c-tor.

	* src/filterx.h:
	  - Make FilterX::wordBreakStr private.
	  - Add wordBreakStr arg to c-tor.
	* src/filterlatex.h:
	  - Add wordBreakStr arg to c-tor with default value for LaTeX,
	    so it can be overridden by FilterLambda.
	* src/filterlatex.cpp (FilterLatex::FilterLatex):
	  - Pass the wordBreakStr to FilterX c-tor instead of direct
	    assignment.
	* src/filterlambda.h (FilterLambda::FilterLambda):
	  - Pass Lambda word break string to FilterLatex c-tor instead of
	    direct assignment.
	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	* src/filterhtml.cpp (FilterHtml::FilterHtml):
	  - Pass specific word break string to FilterX c-tor instead of
	    direct assignment.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterrtf.h, src/filterrtf.cpp:
	  - Apply "indent -nut" + manual adjustments.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlambda.h, -src/filterlambda.cpp, src/Makefile.am:
	  - Apply "indent -nut" + manual adjustments.
	  - Move FilterLambda member functions to header and make them inline.
	  - Drop empty d-tor.
	  - Remove now-unused filterlambda.cpp.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h:
	* src/abswordseg.cpp:
	* src/wordseg.cpp:
	* src/filterx.h:
	* src/filterlatex.h:
	* src/filterlatex.cpp:
	* src/filefilter.h:
	* src/filefilter.cpp:
	* src/maxwordseg.h:
	  - Change pointer declarations to C++ style.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterhtml.h, src/filterhtml.cpp:
	  - Apply "indent -nut" + manual adjustments.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (FilterLatex::Print):
	  - Remove unused var "lwbr".

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.h, src/filterlatex.cpp:
	  - Apply "indent -nut" + manual adjustments.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filefilter.h, src/filefilter.cpp:
	  - Apply "indent -nut" + manual adjustments.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterx.h, -src/filterx.cpp, src/Makefile.am:
	  - Drop unused member "fileopen".
	  - Adjust c-tor to use initialization list; also initialize
	    wordBreakStr to a default value.
	  - Move all FilterX member functions to header and make them inline.
	  - Drop now-unneeded filterx.cpp.

2011-12-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterx.cpp, src/filterx.h:
	  - Apply "indent -nut" + manual adjustments.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordstack.h:
	  - Add preprocessor guard to prevent duplicated inclusion.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordstack.h:
	  - Apply "indent -nut" + manual adjustments.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Move tis2uni() and isthaidigit() to worddef.h

	* src/worddef.h:
	  - Define tis2uni() & isThaiDigit() inline functions here.
	* src/abswordseg.cpp:
	  - Drop old macros.
	  - Include worddef.h.
	  - Replace isthaidigit() calls with isThaiDigit().

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Replace some internal char classifiers with <ctype.h>

	* src/worddef.h, src/filterrtf.cpp, src/filterhtml.cpp:
	  - Drop isSpace(), use isspace() instead.
	  - Drop isHex(), use isxdigit() instead.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/worddef.h:
	  - Apply "indent -nut" + manual adjustments.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp:
	  - Declare internal functions static.
	  - Simplify myyisspace() and make it bool.
	  - Simplify RemoveJunkChars() and invert IsJunkChar() to
	    IsValidChar().

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp:
	  - Apply "indent -nut" + manual adjustments.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp (MaxWordSeg::MaxWordSeg):
	  - Reformat c-tor.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/longwordseg.h, src/longwordseg.cpp:
	  - Apply "indent -nut" + manual adjustments.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h:
	  - Remove obfuscated comment after last #endif.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.h, src/maxwordseg.cpp:
	  - Apply "indent -nut" + manual adjustments.

2011-12-30  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h, src/abswordseg.cpp:
	  - Reformat source as GNU style using "indent -nut" plus manual
	    adjustments.

2011-12-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Adjust loops.

2011-12-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Tokenize punctuation marks in chunks, not in characters,
	    to prevent breaking word in between, fixing problem with `` or ''
	    at line beginning in LaTeX. This should also be true in general.
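
	Illustration only, not the actual CreateWordList() code: one way
	to treat a run of punctuation as a single chunk is to measure
	the whole run before emitting a token. The helper name
	punctRunLength() is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper: length of the leading run of punctuation
// characters in a NUL-terminated string (0 if it does not start with
// punctuation).  A tokenizer can then emit s[0..n) as one token, so no
// break can be inserted inside sequences such as `` or ''.
std::size_t
punctRunLength (const char *s)
{
  assert (s != 0);
  std::size_t n = 0;
  // Cast to unsigned char: passing a negative char to ispunct() is
  // undefined behaviour.
  while (s[n] != '\0' && std::ispunct (static_cast<unsigned char> (s[n])))
    ++n;
  return n;
}
```

	With this sketch, punctRunLength("``quote") covers both backquotes,
	so they stay together as one token.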

2011-12-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea):
	  - Reformat source a little bit.

2011-12-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::SwapLinkSep, AbsWordSeg::GetBestSen):
	  - Reformat source a little bit.

2011-12-29  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Reformat source a little bit.

2011-12-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Guard against i being 0 before accessing sen[i-1].

2011-12-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::IsNumber, AbsWordSeg::IsEnglish):
	  - Replace ASCII range checks with ctype.h functions.

2011-12-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Replace ASCII range checks with ctype.h functions.
	  - Replace TIS-620 range check with macro function.

2011-12-07  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in: Post-release version suffix added.
	* data/swathdic.lst: Remove a typo entry.

2011-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.4.1 ===

2011-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in: Add AC_CONFIG_MACRO_DIR, as suggested by libtoolize.
	* Makefile.am: Add ACLOCAL_AMFLAGS, as suggested by libtoolize.

2011-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, -swath.spec.in: Remove the outdated, unmaintained file.

2011-03-19  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h (AbsWordSeg::Construct):
	  - Remove the remaining declaration of the removed method.

2011-03-19  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::IsLeadChar, AbsWordSeg::IsLastChar):
	  - Simplify the methods.
	  - For IsLeadChar(), also exclude LAKKHANGYAO.
	* src/abswordseg.h, src/abswordseg.cpp
	  (AbsWordSeg::IsNumber, AbsWordSeg::IsEnglish, AbsWordSeg::Has_Karun):
	  - Make the methods accept const pointer as input.

2011-03-19  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Make class WordStack methods inline.

	* src/worddef.h (wordStateType):
	  - Simply name the struct type "wordState" and remove typedef'd type.
	  - Add c-tors for instantiation convenience.
	* src/wordstack.h (class WordStack):
	  - Replace "wordStateType" type with "wordState".
	  - Make all methods inline. They are all short functions.
	* src/Makefile.am, -src/wordstack.cpp:
	  - Remove the now-unused source file.
	* src/maxwordseg.cpp (MaxWordSeg::WordSegArea):
	* src/longwordseg.cpp (LongWordSeg::CreateSentence):
	  - Replace "wordStateType" type with "wordState".

2011-03-19  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Change the dictionary search order & mechanism.

	* src/swath.1: Document the new behavior.
	* src/wordseg.cpp (InitWordSegmentation):
	  - Implement the search order here instead of in main, so the
	    wordseg object is created and destroyed only once.
	  - WORDSEGDATA env is now changed to SWATHDICT.
	* src/wordseg.cpp (main):
	  - Call InitWordSegmentation() only once and let the multiple
	    trials happen there.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::AbsWordSeg):
	* src/wordseg.cpp (main):
	  - Print errors on stderr, not stdout.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main): Rename 'wsegpath' var to 'dictpath'.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Enhance -d option to also accept a trie file.

	* src/abswordseg.h, src/abswordseg.cpp
	  (AbsWordSeg::AbsWordSeg, +AbsWordSeg::InitDict):
	  - Make dictionary opening be a normal method to be called after
	    construction, so the return state can be checked.
	  - Check if the path is a regular file or directory. If it's a
	    directory, append usual dict file name to it before opening.
	* src/maxwordseg.h, src/maxwordseg.cpp:
	* src/longwordseg.h, src/longwordseg.cpp:
	  - Remove c-tor with dict path arg.
	* src/wordseg.cpp (InitWordSegmentation):
	  - Don't try opening file. Just create the wordseg object and
	    call InitDict() to determine success.
	* src/swath.1: Update man page to cover the new behavior, with example.
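
	A minimal sketch of the file-or-directory check described above,
	assuming a POSIX stat() is available. The function name
	dictFileFor() is hypothetical and this is not the InitDict()
	implementation; only the file name swathdic.tri comes from this
	log.

```cpp
#include <cassert>
#include <string>
#include <sys/stat.h>

// Hypothetical helper: if 'path' names a directory, return the usual
// dictionary file inside it; otherwise return 'path' unchanged, so a
// trie file can be passed directly.
std::string
dictFileFor (const std::string &path)
{
  assert (!path.empty ());
  struct stat st;
  if (stat (path.c_str (), &st) == 0 && S_ISDIR (st.st_mode))
    return path + "/swathdic.tri";
  // Regular file, or a path that does not exist: let the caller try to
  // open it and report the failure.
  return path;
}
```

	The caller still has to open the returned path and check the
	result, which is what calling an init method after construction
	makes possible.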

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (InitWordSegmentation):
	  - Use sprintf instead of series of strcat to format trie path name.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterx.h, src/filterx.cpp
	  (GetSuffix, GetPrefix, suffixStr, prefixStr):
	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	* src/filterhtml.cpp (FilterHtml::FilterHtml):
	* src/filterlatex.cpp (FilterLatex::FilterLatex):
	  - Remove unused members & methods.

	* src/filterx.h (wordBreakStr)
	* src/filterrtf.cpp (FilterRTF::FilterRTF):
	* src/filterhtml.cpp (FilterHtml::FilterHtml):
	* src/filterlatex.cpp (FilterLatex::FilterLatex):
	* src/filterlambda.cpp (FilterLatex::FilterLatex):
	  - Make 'wordBreakStr' member (const char *) to string literal.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::AbsWordSeg):
	Use macro D2TRIE for trie file name instead of hard literal.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h, src/abswordseg.cpp
	  (AbsWordSeg, AbsWordSeg::AbsWordSeg, AbsWordSeg::InitData):
	  - Remove unused 'cntSep' member.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Move dict initialization to AbsWordSeg base class.

	* src/abswordseg.h (MyDict, InitData):
	  - Make the members private.
	* src/abswordseg.h, src/abswordseg.cpp
	  (AbsWordSeg::AbsWordSeg, +Construct, AbsWordSeg::~AbsWordSeg):
	* src/longwordseg.cpp (LongWordSeg::LongWordSeg):
	* src/maxwordseg.cpp (MaxWordSeg::MaxWordSeg):
	  - Move MyDict c-tor & d-tor code from derived class to base class.
	  - Add base class c-tor which accepts dataPath arg, and split common
	    c-tor code into AbsWordSeg::Construct().
	* src/longwordseg.h, src/longwordseg.cpp (-LongWordSeg::~LongWordSeg):
	  - Remove the now-empty d-tor.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (Usage): Print usage on stderr, not stdout.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main): Make 'wsegpath', 'method', 'fileformat',
	'unicode' (const char *).

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Make 'wbr' word delimiter a (const char *).

	* src/abswordseg.h, src/abswordseg.cpp
	  (AbsWordSeg::WordSeg, AbsWordSeg::GetBestSen):
	  - Declare 'wbr' arg (const char *).
	* src/filterx.h, src/filterx.cpp (FilterX::GetWordBreak):
	  - Make it return (const char *) instead of copying string back.
	* src/wordseg.cpp (WordSegmentation):
	  - Declare 'wbr' arg (const char *).
	* src/wordseg.cpp (main):
	  - Make 'wbr' (const char *).
	  - Replace all allocs & copies with pointer assignments.
	  - Remove all deletes.

2011-03-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/swath.1: Update doc for -d option. The dictionary file is now
	a single `swathdic.tri' file.

2011-03-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Make check functions in worddef.cpp inline.

	* src/Makefile.am, src/worddef.h, -src/worddef.cpp
	(isSpace, isHex, isPunc, isThai):
	  - Move function definitions from worddef.cpp to worddef.h and make
	    them inline.
	  - Remove worddef.cpp from the project.

2011-03-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Replace hard checks for Thainess with new function isThai().

	* src/worddef.h, src/worddef.cpp (+isThai): Add the check function.
	* src/filterlatex.cpp (FilterLatex::GetNextToken): Replace hard
	checks (c & 0x80), or (c > 0), with isThai(c) calls.

2011-03-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	  - Replace partial copy using temporary null-termination with
	    strncpy() and strncat().
	  - Replace a risky strcpy() on overlapping buffers with memmove().
	  - Split an assignment in if-condition into a separate statement,
	    for better readability.
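
	Illustration only (the helper dropPrefix() is hypothetical, not
	taken from filterlatex.cpp): strcpy() has undefined behaviour
	when source and destination overlap, while memmove() is
	specified to handle overlap.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: drop the first n characters of a NUL-terminated
// buffer in place.  Source (s + n) and destination (s) overlap, which
// strcpy() does not allow; memmove() copies correctly in that case.
void
dropPrefix (char *s, std::size_t n)
{
  std::size_t len = std::strlen (s);
  assert (n <= len);
  std::memmove (s, s + n, len - n + 1);   // +1 also moves the NUL
}
```

	The length is computed once up front because memmove(), unlike
	strcpy(), does not stop at the terminating NUL by itself.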

2011-03-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/worddef.cpp (isSpace, isHex, isPunc):
	  - Replace decimal ASCII codes with char literals, for more
	    readability
	  - Simplify return expressions

2010-09-24  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::IsLeadChar): Also include PHINTHU,
	NIKHAHIT and YAMAKKAN in non-leading char list.

2009-04-18  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in: Post-release version suffix added.
	* src/wordseg.cpp (main, +Version): Add -V/--version to print version
	info, as suggested by Beamer User.
	* src/Makefile.am: Pass -DVERSION CFLAGS.

2009-04-08  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.4.0 ===

2009-04-08  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterrtf.h (FilterRTF::chgCharState, FilterRTF::isThaiChar):
	Declare utility functions as static members.

2009-04-08  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filterlatex.cpp (WinMacNormal, MacOffsetLeft, MacOffsetLeftHigh,
	MacOffsetNormal, WinOffsetLeft, WinOffsetLeftHigh, WinOffsetNormal):
	Declare internal data as file-scoped.

	* src/filterlatex.h
	(FilterLatex::isLongTailChar, FilterLatex::idxVowelToneMark):
	Declare utility functions as static members.

	* src/filterlatex.cpp (FilterLatex::isLongTailChar):
	Eliminate unnecessary true/false conditional expression; simple boolean
	expression is enough.

	* src/filterlatex.cpp (FilterLatex::idxVowelToneMark):
	Check array index range before accessing, rather than after.

2009-04-08  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp (MaxWordSeg::CreateSentence):
	Check array index range before accessing, rather than after.

2009-04-07  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/longwordseg.cpp (LongWordSeg::LongWordSeg):
	Use 'delete[]' instead of 'delete', to match with 'new []', fixing
	valgrind warning.

	* src/longwordseg.cpp (LongWordSeg::CreateSentence):
	Check array index range before accessing, rather than after, fixing
	valgrind warnings of conditional jumps depending on uninitialized
	values.

2009-04-07  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/maxwordseg.cpp
	(MaxWordSeg::CreateSentence, MaxWordSeg::WordSegArea):
	Check range of array index for IdxSep[] *before* accessing the array
	element, rather than *after*, fixing valgrind warnings of conditional
	jumps depending on uninitialized values.
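
	A small sketch of the pattern, with a hypothetical helper and a
	plain int array standing in for IdxSep[]. It is not the
	CreateSentence() code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: index of the first negative entry in
// sep[0..len), or len if there is none.
std::size_t
firstNegative (const int *sep, std::size_t len)
{
  assert (sep != 0 || len == 0);
  std::size_t i = 0;
  // The bounds test comes first; && short-circuits, so sep[i] is never
  // read once i == len.  Writing "sep[i] >= 0 && i < len" instead would
  // read one element past the end before noticing.
  while (i < len && sep[i] >= 0)
    ++i;
  return i;
}
```

	Swapping the two operands of && is the whole fix; the loop body
	and the result are unchanged for in-range data.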

2009-04-07  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	* src/filterhtml.cpp (FilterHtml::FilterHtml):
	* src/filterrtf.cpp (FilterRTF::FilterRTF, FilterRTF::GetNextToken):
	* src/filterlatex.cpp
	  (FilterLatex::FilterLatex, FilterLatex::GetNextToken):
	* src/wordseg.cpp (main):
	Replace strcpy() and strcmp() calls that take empty-string arguments
	with first-character accesses.

	* src/filterlatex.cpp (FilterLatex::GetNextToken):
	Replace overlapping strcpy() with memmove(), fixing valgrind warnings.

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main): Move global vars 'startStr', 'buff', 'gout'
	to local scope, where they are actually used.

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main): Move global var 'mulestr' to local scope.
	Make it just bool 'muleMode', as all its function is just that.

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (InitWordSegmentation, main):
	Make the global var 'method' local and passed as argument.

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h (AbsWordSeg::~AbsWordSeg): Declare d-tor as virtual,
	fixing memory leak because its derived class d-tor was not called.
	Thanks valgrind.

	* src/wordseg.cpp (InitWordSegmentation, main): Move 'delete method' to
	main(), where it's more obvious.

	* src/wordseg.cpp (ExitWordSegmentation, WordSegmentation, main):
	Make ExitWordSegmentation() and WordSegmentation() accept mere
	AbsWordSeg pointer, rather than pointer to pointer, reducing
	dereferencing steps.
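
	Illustration of the virtual d-tor fix, using hypothetical Base
	and Derived classes rather than the real AbsWordSeg hierarchy.
	Without 'virtual' on the base d-tor, deleting through a base
	pointer is undefined behaviour and in practice skips the derived
	d-tor, leaking whatever it owns.

```cpp
#include <cassert>

// Counts buffers currently owned by Derived objects.
static int liveBuffers = 0;

class Base
{
public:
  virtual ~Base () {}   // virtual: delete via Base* also runs ~Derived
};

class Derived : public Base
{
public:
  Derived () : buf (new int[16]) { ++liveBuffers; }
  ~Derived () { delete[] buf; --liveBuffers; }
private:
  int *buf;
};

// Create a Derived, destroy it through a Base pointer, and report how
// many buffers are still alive afterwards (0 when nothing leaked).
int
leakedAfterBaseDelete ()
{
  Base *p = new Derived;
  assert (liveBuffers == 1);
  delete p;
  return liveBuffers;
}
```

	In this sketch leakedAfterBaseDelete() reports no live buffers,
	because the derived d-tor runs when the object is deleted via
	the base pointer.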

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Fix valgrind warnings about mismatched new/delete.

	* src/abswordseg.cpp (AbsWordSeg::~AbsWordSeg):
	* src/maxwordseg.cpp (MaxWordSeg::MaxWordSeg, MaxWordSeg::~MaxWordSeg):
	* src/wordseg.cpp (InitWordSegmentation, main):
	Replace 'delete' with 'delete[]' where data was created with new[].

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* AUTHORS: Update info, as the trie supporting code has been removed.
	Now my role is general maintenance. And describe Phaisarn's role as
	the original creator.

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::AbsWordSeg):
	Replace C malloc() calls with C++ 'new' operator, to better match with
	the 'delete' operator in destructor.

2009-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h (AbsWordSeg::Has_Karun):
	Declare another utility function as static member.
	* src/abswordseg.cpp (AbsWordSeg::Has_Karun):
	Replace negative number literal with hexadecimal code, for readability.

2008-12-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	Remove unnecessary 'continue'.

2008-12-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.h
	(AbsWordSeg::IsLeadChar, IsLastChar, IsNumber, IsEnglish):
	Declare utility functions as static members.
	* src/abswordseg.cpp (AbsWordSeg::IsNumber, IsEnglish):
	Replace code with simpler equivalents.
	* src/abswordseg.cpp (AbsWordSeg::GetBestSen):
	Get rid of unnecessary assignment and strcat().

2008-12-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Get rid of the unnecessary src/dictpath.cpp.

	* src/Makefile.am, -src/dictpath.cpp: Remove dictpath.cpp.
	* src/dictpath.h (d2triepath), src/wordseg.cpp (InitWordSegmentation):
	Get rid of the unnecessary global variable 'd2triepath', which is in
	fact only needed locally.

2008-12-17  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Switch to libdatrie. (Requires libdatrie >= 0.1.99.2)

	* configure.in: Post-release version bump.

	* configure.in: Check for 'trietool-0.2' program (under --enable-dict)
	and 'datrie-0.2' pkg-config.
	* configure.in, Makefile.am: Exclude 'misc', 'vmem', and 'trie' subdirs.

	* src/abswordseg.h:
	  - Replace 'AbsWordSeg::MyDict' member with libdatrie's 'Trie'
	* src/abswordseg.cpp (AbsWordSeg::AbsWordSeg):
	  - Don't delete 'MyDict' in c-tor
	* src/abswordseg.cpp (AbsWordSeg::CreateWordList):
	  - Replace calls to old 'Trie' class with corresponding libdatrie
	    functions
	  - Adjust loop so that terminating '\0' is not walked

	* src/longwordseg.h, src/longwordseg.cpp
	  (LongWordSeg::LongWordSeg, ~LongWordSeg):
	  - Replace 'MyDict' creation/deletion with libdatrie's functions
	  - Remove 'branchPath' and 'tailPath' members; and just use local
	    vars in c-tors instead

	* src/maxwordseg.h, src/maxwordseg.cpp
	  (MaxWordSeg::MaxWordSeg, ~MaxWordSeg):
	  - Replace 'MyDict' creation/deletion with libdatrie's functions
	  - Remove 'branchPath' and 'tailPath' members; and just use local
	    vars in c-tors instead

	* src/dictpath.h, src/dictpath.cpp:
	* src/wordseg.cpp (InitWordSegmentation):
	  - Replace 'd2branchpath' and 'd2tailpath' with a single
	    'd2triepath' variable
	  - Replace 'D2BRANCH' and 'D2TAIL' with a single 'D2TRIE' macro

	* src/wordstack.h:
	  - Remove unneeded #include "misc/typedefs.h"

	* src/Makefile.am:
	  - Remove linkages to internal 'libmisc', 'libvmem', and 'libtrie'
	  - Add libdatrie CFLAGS and LIBS

	* data/Makefile.am, -data/swathdic.br, -data/swathdic.tl:
	  - Exclude 'swathdic.br' and 'swathdic.tl' from distribution
	* data/Makefile.am, +data/swathdic.abm:
	  - Add 'swathdic.abm' to distribution
	  - Add rule for generating 'swathdic.tri' with trietool-0.2
	  - Install 'swathdic.tri' instead of 'swathdic.br' and 'swathdic.tl'

2008-04-06  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.3.4 ===

2008-03-24  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* data/Makefile.am, +data/swathdic.lst: Add dictionary word list
	dumped from the dict binary files, for dict adjustments in the future.

2008-03-23  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Use tmpfile() instead of tmpnam() when creating temp files,
	to avoid a race condition, as a security measure.

	* src/wordseg.cpp (main):
	  - Use FILE* instead of file names for temp files
	  - Call tmpfile() to create temp files
	  - Pass FILE* to conv() and CreateFileFilter()

	* conv/conv.{h,cxx}:
	  - Add overloaded conv() accepting FILE* arguments
	  - Refactor do_conv() out of conv() wrappers
	  - Pass FILE* to CreateText{Reader,Writer}
	* conv/convfact.{h,cxx} (CreateTextReader, CreateTextWriter):
	  - Accept FILE* arguments instead of istream, ostream
	  - Pass FILE* arguments to {TIS620,UTF8}{Reader,Writer} c-tors
	* conv/{tis620,utf8}.{h,cxx}:
	  - Change stream members' type to FILE*
	  - Use fgetc() and fputc() for character I/O
	  - Declare internal functions static

	* src/filefilter.{h,cpp} (CreateFileFilter):
	  - Use FILE* arguments instead of file names
	  - Pass the FILE* arguments to Filter* c-tors
	* src/filter{html,latex,lambda,rtf}.{h,cpp}:
	  - Accept FILE* arguments instead of file names in c-tors
	  - Pass the FILE* arguments to FilterX base class c-tor
	* src/filterx.{h,cpp}:
	  - In c-tor, assign FILE* arguments to members directly, rather
	    than creating new files from file names
	  - In d-tor, just flush output, rather than closing files
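
	A minimal sketch of the tmpfile() approach, not the code in
	wordseg.cpp; the function roundTrip() is hypothetical. Since
	tmpfile() creates and opens the file in one step and returns a
	FILE*, there is no file name that another process could claim
	between name generation and open, as can happen with tmpnam().

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical round trip: write 'text' to an anonymous temp file,
// rewind, and count the bytes read back.  Returns -1 if the temp file
// could not be created.
long
roundTrip (const char *text)
{
  assert (text != 0);
  std::FILE *tmp = std::tmpfile ();   // opened in "wb+" mode
  if (!tmp)
    return -1;
  std::fputs (text, tmp);
  std::rewind (tmp);                  // switch from writing to reading
  long n = 0;
  while (std::fgetc (tmp) != EOF)
    ++n;
  std::fclose (tmp);                  // the file is removed on close
  return n;
}
```

	Passing the FILE* on to readers, writers and filters, as the
	entry above does, avoids ever reopening the file by name.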

2008-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/swath.1: Escape more minus signs. [lintian]

2008-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.3.3 ===

2008-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main): Move FltX variable into if-block. Beautify
	some indents.

2008-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/filefilter.{h,cpp}: Make "FileFilter" an empty class with a
	single static method CreateFilter(). The full-fledged
	{con,de}structors are just unnecessary.

	* src/wordseg.cpp (main): Create FilterX with the static method.
	Remove now-unneeded FileFilter variable.

2008-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main): Delete filter X object after finishing
	wordseg, so output file gets flushed.

2008-03-20  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main): duplicate tmpnam() results, instead of mere
	pointer assignment, as the returned value is pointer to static buffer,
	resulting in the same names for tmpin and tmpout. Fixing
	non-functional '-u u,u' option. Bug report by Neutron Soutmun.

2008-03-19  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.cpp (AbsWordSeg::CreateWordList): Fix logical errors
	introduced during portability fix, which made swath not break any
	word. Bug report by Pisut Tempatarachoke.

2008-02-07  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/swath.1: Escape minus signs. Thanks debian's lintian.

2008-02-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.3.2 ===

2008-02-02  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in: Remove unused ISODATE var. Remove checks for CC and
	CPP. Just CXX is enough.

2008-02-01  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	Reveal the encoding conversion feature to users.

	* src/wordseg.cpp (Usage): Add '-u' option in help message.
	* src/swath.1: Add documentation for '-u' option, with example.
	* README: Mention the '-u' option for UTF-8 LaTeX files. Indent the
	sample codes for readability.

	* src/swath.1: Document the default matching scheme, and adjust the
	example accordingly.

2008-02-01  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (main):
	  - Also accept '--verbose' and '--help' options.
	  - Adjust indents around the code, for readability.
	  - Remove unnecessary continue's.
	  - Check boundary of argc for '-u' parsing.
	  - Free more allocated data (fileformat, method, unicode) on return
	    after printing usage.

2008-01-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/wordseg.cpp (Usage): Revise wording for the help message.
	Use string catenation instead of separate printf's.

2008-01-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* README: Write document.
	* src/swath.1: Rewrite the whole page, with more detailed info.

2008-01-31  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* data/Makefile.am: Install dict in ${pkgdatadir}, not ${datadir}.
	* src/Makefile.am: Update dict location macro accordingly.

2006-07-03  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/abswordseg.{h,cpp} (IsLeadChar(), IsLastChar()),
	src/filterhtml.cpp (GetNextToken()),
	src/filterlatex.cpp (GetNextToken()),
	src/filterrtf.cpp (GetNextToken()): Fixed char signedness portability
	issues (found on s390, powerpc, arm builds by debian buildd).
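
	Illustration of the kind of signedness problem involved, with
	hypothetical helpers that are not the code changed here. Plain
	char is signed on some platforms and unsigned on others
	(commonly unsigned on Linux for arm, powerpc and s390), so tests
	such as "c < 0" or table lookups indexed by a plain char behave
	differently per platform. Converting to unsigned char first
	gives the same result everywhere.

```cpp
#include <cassert>

// Hypothetical check: true for bytes in the upper half (0x80..0xFF),
// where TIS-620 Thai characters live, regardless of char signedness.
bool
isHighByte (char c)
{
  return static_cast<unsigned char> (c) >= 0x80;
}

// Hypothetical lookup index into a 128-entry table of upper-half
// characters.  Subtracting from a plain (possibly negative) char would
// produce a negative index on signed-char platforms.
int
upperHalfIndex (char c)
{
  assert (isHighByte (c));
  return static_cast<unsigned char> (c) - 0x80;
}
```

	Both helpers give the same answers whether the compiler treats
	plain char as signed or unsigned.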

2006-03-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* configure.in, NEWS:
	=== Version 0.3.1 ===

2006-03-27  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* src/swath.1: Used section number instead of version number.

2006-03-26  Theppitak Karoonboonyanan  <thep@linux.thai.net>

	* Makefile.am: Removed debian from SUBDIRS.

	* configure.in: Removed debian/Makefile generation.

2005-10-09  Theppitak Karoonboonyanan <thep@linux.thai.net>

	* configure.in:
	Formatted configure options help strings with AC_HELP_STRING().
	Used --disable/--enable help style rather than --enable with default
	yes or no. Also disabled debug by default.

2005-05-07  Chanop Silpa-Anan <chanop@debian.org>

	* src/abswordseg.cpp:
	A quick hack for Apple/Darwin: malloc is defined in stdlib.h instead
	of a more common place malloc.h.

2004-03-30  Theppitak Karoonboonyanan <thep@linux.thai.net>

	* AUTHORS:
	Fix my e-mail address.

2003-04-04  Theppitak Karoonboonyanan <thep@linux.thai.net>

	* conv/tis620.cxx, conv/utf8.cxx:
	Use casting instead of declaring temp vars in dealing with
	iostream::get() with unsigned char argument.

2003-04-03  Chanop Silpa-Anan <chanop@debian.org>

	* conv/{conv.cxx conv.h convfact.cxx convfact.h tis620.cxx tis620.h
	  utf8.cxx utf8.}:
	Clean up for g++-3.2: compilation errors, compiler warnings
	and namespace issues.

	* trie/{trie.h trie.cxx}:
	Clean up for g++-3.2: compilation errors. Use strict
	ios_base::openmodes for OpenModes instead of int previously allowed by
	prior compilers.

	* vmem/{dataheap.cxx dataheap.h vmem.cxx vmem.h}:
	Clean up for g++-3.2: compilation errors. Use strict
	ios_base::openmodes for OpenModes instead of int previously allowed by
	prior compilers. Also use namespace std in .h files, a quick hack.

2003-01-14  Theppitak Karoonboonyanan <thep@linux.thai.net>

	* swath.spec.in:
	Fix "%install" mess in comment (rpmbuild oddity)

2002-09-24  Theppitak Karoonboonyanan <thep@linux.thai.net>

	* src/wordseg.cpp:
	Fix segfault in case of unknown file format.
	Nicer "Usage:" handling.
	Remove winlatex, maclatex from Usage:

2002-09-23  Theppitak Karoonboonyanan <thep@linux.thai.net>

	* configure.in:
	Add --enable-debug to allow disabling assertions.

	* configure.in, src/filterlatex.cpp:
	Add --enable-catthai to allow disabling Thai line catenation.
	(temporary solution, may be replaced with command-line option or
	hard-coding later)

2002-09-21  Theppitak Karoonboonyanan <thep@linux.thai.net>

	* configure.in:
	Add missing debian/Makefile in AC_OUTPUT.

2001-12-21  Theppitak Karoonboonyanan <thep@links.nectec.or.th>

	* GNU autotools files:
	Rearrange source tree and apply GNU autotools.

	* Version 0.3.0.

