| author | Mohammad Amoush <47069173+mamoush34@users.noreply.github.com> | 2020-01-19 15:15:53 +0300 |
|---|---|---|
| committer | Mohammad Amoush <47069173+mamoush34@users.noreply.github.com> | 2020-01-19 15:15:53 +0300 |
| commit | 7683e1fbb53fe683c0d04e537d89fb53d768e852 (patch) | |
| tree | d81eebcd5a129550a49fdfc852b8bb6220907a1a /solr-8.3.1/contrib/langid | |
| parent | f4382d73eec75f7d7f4bfe6eae3fb1efa128a021 (diff) | |
| parent | aff9cc02750eb032ade98d77cf9ff45677063fc8 (diff) | |
Merge branch 'master' of https://github.com/browngraphicslab/Dash-Web into webcam_mohammad
Diffstat (limited to 'solr-8.3.1/contrib/langid')
| -rw-r--r-- | solr-8.3.1/contrib/langid/README.txt | 22 |
| -rw-r--r-- | solr-8.3.1/contrib/langid/lib/jsonic-1.2.7.jar | bin, 0 -> 147477 bytes |
| -rw-r--r-- | solr-8.3.1/contrib/langid/lib/langdetect-1.1-20120112.jar | bin, 0 -> 1236033 bytes |
| -rw-r--r-- | solr-8.3.1/contrib/langid/lib/opennlp-tools-1.9.1.jar | bin, 0 -> 1248314 bytes |
4 files changed, 22 insertions, 0 deletions
diff --git a/solr-8.3.1/contrib/langid/README.txt b/solr-8.3.1/contrib/langid/README.txt
new file mode 100644
index 000000000..68a2ea58c
--- /dev/null
+++ b/solr-8.3.1/contrib/langid/README.txt
@@ -0,0 +1,22 @@
+Apache Solr Language Identifier
+
+
+Introduction
+------------
+This module is intended to be used while indexing documents.
+It is implemented as an UpdateProcessor to be placed in an UpdateChain.
+Its purpose is to identify language from documents and tag the document with language code.
+The module can optionally map field names to their language specific counterpart,
+e.g. if the input is "title" and language is detected as "en", map to "title_en".
+Language may be detected globally for the document, and/or individually per field.
+Language detector implementations are pluggable.
+
+Getting Started
+---------------
+Please refer to the module documentation at http://wiki.apache.org/solr/LanguageDetection
+
+Dependencies
+------------
+The Tika detector depends on Tika Core (which is part of extraction contrib)
+The Langdetect detector depends on LangDetect library
+The OpenNLP detector depends on OpenNLP tools and requires a previously trained user-supplied model
diff --git a/solr-8.3.1/contrib/langid/lib/jsonic-1.2.7.jar b/solr-8.3.1/contrib/langid/lib/jsonic-1.2.7.jar
new file mode 100644
index 000000000..11fcfd4ba
--- /dev/null
+++ b/solr-8.3.1/contrib/langid/lib/jsonic-1.2.7.jar
Binary files differ
diff --git a/solr-8.3.1/contrib/langid/lib/langdetect-1.1-20120112.jar b/solr-8.3.1/contrib/langid/lib/langdetect-1.1-20120112.jar
new file mode 100644
index 000000000..2e7a9cf36
--- /dev/null
+++ b/solr-8.3.1/contrib/langid/lib/langdetect-1.1-20120112.jar
Binary files differ
diff --git a/solr-8.3.1/contrib/langid/lib/opennlp-tools-1.9.1.jar b/solr-8.3.1/contrib/langid/lib/opennlp-tools-1.9.1.jar
new file mode 100644
index 000000000..cb7b031c8
--- /dev/null
+++ b/solr-8.3.1/contrib/langid/lib/opennlp-tools-1.9.1.jar
Binary files differ
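
Note on the field mapping described in the added README: the "title" to "title_en" renaming can be pictured with a small sketch in plain Java. This is only an illustration of the idea, not Solr code. The class LangFieldMapper, its methods, and the "language" tag field name are hypothetical, the document is modeled as a simple Map, and the language code is assumed to have been detected already.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, NOT part of Solr: it only illustrates the
// "title" + "en" -> "title_en" mapping described in the README.
class LangFieldMapper {

    // Append the language code to a field name, e.g. ("title", "en") -> "title_en".
    static String mapFieldName(String field, String lang) {
        return field + "_" + lang;
    }

    // Return a copy of the document in which the selected fields are renamed
    // to their language-specific counterpart and the language code is stored
    // in langField. Fields not listed in fieldsToMap keep their names.
    static Map<String, Object> mapDocument(Map<String, Object> doc,
                                           Set<String> fieldsToMap,
                                           String lang,
                                           String langField) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            String name = fieldsToMap.contains(e.getKey())
                    ? mapFieldName(e.getKey(), lang)
                    : e.getKey();
            out.put(name, e.getValue());
        }
        // Tag the document with the (already detected) language code.
        out.put(langField, lang);
        return out;
    }
}
```

Under these assumptions, a document with fields "id" and "title", mapped for "title" with language "en", would come out with "id", "title_en", and a "language" field holding "en". In the real module this work happens inside the update processor chain during indexing.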
