path: root/solr-8.1.1/contrib/langid
author bob <bcz@cs.brown.edu> 2019-08-19 10:11:59 -0400
committer bob <bcz@cs.brown.edu> 2019-08-19 10:11:59 -0400
commit e37bf9124c952aa26c3e29deb9e4faa01cad1a7e
tree be44ae9bd5e2eb6c5ce392383d41505b5863d061 /solr-8.1.1/contrib/langid
parent 07482c3bf435748140addfd4fd338fc668657798
parent b037aa89fb564812f880994453ce002054a0ad82
Merge branch 'master' into presentation_f
Diffstat (limited to 'solr-8.1.1/contrib/langid')
-rw-r--r-- solr-8.1.1/contrib/langid/README.txt | 22
-rw-r--r-- solr-8.1.1/contrib/langid/lib/jsonic-1.2.7.jar | bin 0 -> 147477 bytes
-rw-r--r-- solr-8.1.1/contrib/langid/lib/langdetect-1.1-20120112.jar | bin 0 -> 1236033 bytes
-rw-r--r-- solr-8.1.1/contrib/langid/lib/opennlp-tools-1.9.1.jar | bin 0 -> 1248314 bytes
4 files changed, 22 insertions, 0 deletions
diff --git a/solr-8.1.1/contrib/langid/README.txt b/solr-8.1.1/contrib/langid/README.txt
new file mode 100644
index 000000000..68a2ea58c
--- /dev/null
+++ b/solr-8.1.1/contrib/langid/README.txt
@@ -0,0 +1,22 @@
+Apache Solr Language Identifier
+
+
+Introduction
+------------
+This module is intended to be used while indexing documents.
+It is implemented as an UpdateProcessor to be placed in an UpdateChain.
+Its purpose is to identify the language of each document and tag the document with a language code.
+The module can optionally map field names to their language-specific counterparts,
+e.g. if the input field is "title" and the language is detected as "en", it maps to "title_en".
+Language may be detected globally for the document, and/or individually per field.
+Language detector implementations are pluggable.
+
+Getting Started
+---------------
+Please refer to the module documentation at http://wiki.apache.org/solr/LanguageDetection
+
+Dependencies
+------------
+The Tika detector depends on Tika Core (which is part of the extraction contrib).
+The Langdetect detector depends on the LangDetect library.
+The OpenNLP detector depends on OpenNLP tools and requires a previously trained, user-supplied model.
diff --git a/solr-8.1.1/contrib/langid/lib/jsonic-1.2.7.jar b/solr-8.1.1/contrib/langid/lib/jsonic-1.2.7.jar
new file mode 100644
index 000000000..11fcfd4ba
--- /dev/null
+++ b/solr-8.1.1/contrib/langid/lib/jsonic-1.2.7.jar
Binary files differ
diff --git a/solr-8.1.1/contrib/langid/lib/langdetect-1.1-20120112.jar b/solr-8.1.1/contrib/langid/lib/langdetect-1.1-20120112.jar
new file mode 100644
index 000000000..2e7a9cf36
--- /dev/null
+++ b/solr-8.1.1/contrib/langid/lib/langdetect-1.1-20120112.jar
Binary files differ
diff --git a/solr-8.1.1/contrib/langid/lib/opennlp-tools-1.9.1.jar b/solr-8.1.1/contrib/langid/lib/opennlp-tools-1.9.1.jar
new file mode 100644
index 000000000..cb7b031c8
--- /dev/null
+++ b/solr-8.1.1/contrib/langid/lib/opennlp-tools-1.9.1.jar
Binary files differ
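
The field mapping described in the added README.txt (input field "title", detected language "en", output field "title_en") can be illustrated with a small plain-Java sketch. This is not the Solr implementation and uses no Solr classes: the class name LangFieldMapper, its methods, and the "language" tag field are hypothetical names chosen for illustration, and the language code is assumed to have been produced already by one of the detectors.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Solr) illustrating the README's
 * "title" + "en" -> "title_en" field-name mapping.
 */
final class LangFieldMapper {
    private LangFieldMapper() {}

    /** Appends "_<lang>" to the field name; leaves it unchanged if no language is known. */
    static String mapField(String field, String lang) {
        if (lang == null || lang.isEmpty()) {
            return field;
        }
        return field + "_" + lang;
    }

    /**
     * Returns a copy of the document with every field renamed to its
     * language-specific counterpart and the language code stored in langField.
     * The name of langField is an assumption of this sketch.
     */
    static Map<String, Object> mapDocument(Map<String, Object> doc, String lang, String langField) {
        Map<String, Object> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            mapped.put(mapField(entry.getKey(), lang), entry.getValue());
        }
        if (lang != null && !lang.isEmpty()) {
            mapped.put(langField, lang); // tag the document with its language code
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "Hello world");
        // "en" stands in for the result of a real language detector.
        System.out.println(mapDocument(doc, "en", "language"));
    }
}
```

In the module itself this behaviour is configured on the update processor rather than written by hand; the sketch only makes the naming rule concrete.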