Before using the service, read the preliminary information, which describes the steps required to access the CLARIN-PL developer interface.
Any2txt is a service that extracts plain text from a file containing text, such as a document or a spreadsheet. It uses the Apache Tika package.
Note:
If the input data is imported as a file, Any2txt should be the first service in the processing pipeline.
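The ordering rule in the note can be sketched as a small helper. This is only an illustration, assuming a pipeline is written as a list of service names in the same style as the ['any2txt'] query used in this document. The function name ensure_any2txt_first and the step name 'next_service' are hypothetical placeholders, not part of any CLARIN-PL API.

```python
def ensure_any2txt_first(pipeline):
    """Return a copy of the pipeline with 'any2txt' as its first step.

    Assumption: a pipeline is a plain list of service-name strings.
    """
    # Drop any existing 'any2txt' entry so it is not duplicated.
    rest = [step for step in pipeline if step != "any2txt"]
    return ["any2txt"] + rest


# 'next_service' is a placeholder, not a real service name.
print(ensure_any2txt_first(["next_service"]))
print(ensure_any2txt_first(["next_service", "any2txt"]))
```

Both calls return a list that starts with 'any2txt', so later steps in the pipeline receive text rather than the original file.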
Any2txt can be run with an LPMN query in the LPMN Client service.
The service takes no parameters.
Any2txt can be run in the Windows system with default values using the following LPMN query: ['any2txt'].
Use [['any2txt']] when the input data is a compressed directory (.zip).
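The choice between the two query forms can be sketched in Python. This is a minimal illustration, assuming the query is held as a Python list and that a .zip extension is enough to recognise a compressed directory. The helper name any2txt_query and the file names are hypothetical, and sending the query to the LPMN Client is not shown.

```python
from pathlib import Path


def any2txt_query(input_path):
    """Pick the any2txt query form for a given input path.

    Assumption: a compressed directory is recognised by its .zip extension.
    """
    is_zip = Path(input_path).suffix.lower() == ".zip"
    # Nested list for a compressed directory, flat list for a single file.
    return [["any2txt"]] if is_zip else ["any2txt"]


print(any2txt_query("report.docx"))  # ['any2txt']
print(any2txt_query("corpus.zip"))   # [['any2txt']]
```

Checking only the extension keeps the sketch short. Real code might instead inspect the file contents, for example with zipfile.is_zipfile from the standard library.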
Input: a file containing text, e.g. in DOC, DOCX, XLSX, or TXT format.
Output: a UTF-8 text file, limited to 1 GB.
In Colab: Any2txt - Conversion of a file containing text to text
(C) CLARIN-PL