  1. ModeShape
  2. MODE-841

File content is not being extracted and included in the search indexes

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Fix Version/s: 2.5.0.Beta1
    • Affects Version/s: 2.0.0.Final
    • None
    • None

    Description

      When a file is uploaded into the repository (or projected via the file system connector), the "jcr:data" binary value of the "jcr:content" node is being ignored by the Lucene search engine. The content of the file should be run through a text extractor (such as Tika) to pull out the terms that should be indexed.
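
      The expected flow (binary value, then text extraction, then indexed terms) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ContentIndexSketch`, `TextExtractor`, `PLAIN_TEXT`, `indexContent`, and `search` are hypothetical names, not ModeShape, Tika, or Lucene APIs. The stand-in extractor simply decodes bytes as UTF-8, where a real extractor such as Tika would detect the file format first, and a plain in-memory map stands in for the Lucene index.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ContentIndexSketch {

    /** Hypothetical extractor contract: turn a binary value into plain text. */
    interface TextExtractor {
        String extract(byte[] data);
    }

    /** Stand-in extractor (assumption): treats the bytes as UTF-8 text. */
    static final TextExtractor PLAIN_TEXT =
            data -> new String(data, StandardCharsets.UTF_8);

    /** Toy inverted index: term -> paths of nodes whose content has that term. */
    private final Map<String, Set<String>> index = new HashMap<>();
    private final TextExtractor extractor;

    ContentIndexSketch(TextExtractor extractor) {
        this.extractor = extractor;
    }

    /** Extracts text from a "jcr:data" value and indexes each term it contains. */
    void indexContent(String nodePath, byte[] jcrData) {
        String text = extractor.extract(jcrData).toLowerCase(Locale.ROOT);
        // Split on anything that is not a letter or digit.
        for (String term : text.split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(nodePath);
            }
        }
    }

    /** Returns the paths of all nodes whose extracted content contains the term. */
    Set<String> search(String term) {
        return index.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        ContentIndexSketch idx = new ContentIndexSketch(PLAIN_TEXT);
        idx.indexContent("/files/readme.txt/jcr:content",
                "ModeShape stores file content".getBytes(StandardCharsets.UTF_8));
        System.out.println(idx.search("content"));
    }
}
```

      The point of the sketch is that the search index only ever sees extracted terms. If the extraction step is skipped for "jcr:data", as this issue describes, a query on words inside the file cannot match the "jcr:content" node.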

            People

              Assignee: Randall Hauch (rhauch, Inactive)
              Reporter: Randall Hauch (rhauch, Inactive)
              Votes: 1
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: