Documentation

Extracts text from PDF files and stores it as XML.
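
The source does not describe the XML layout or the PDF library used, so the following is only a minimal sketch of the "put it in XML" half. It assumes the text has already been extracted page by page with some PDF library (not shown). The function name `pages_to_xml` and the `document`/`page` element names are hypothetical, chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 does not allow; extracted PDF text
# sometimes contains them (for example form feeds between pages).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def pages_to_xml(source_name, page_texts):
    """Wrap already-extracted page text in a small XML document.

    The element and attribute names are assumptions for illustration,
    not the format this tool actually produces.
    """
    root = ET.Element("document", {"source": source_name})
    for number, text in enumerate(page_texts, start=1):
        page = ET.SubElement(root, "page", {"number": str(number)})
        # Drop characters that would make the output invalid XML.
        page.text = _INVALID_XML_CHARS.sub("", text)
    # ElementTree escapes &, < and > in text nodes when serializing.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(pages_to_xml("example.pdf", ["First page text", "Second page: a < b"]))
```

Building the tree with `xml.etree.ElementTree` rather than concatenating strings leaves the escaping of special characters to the library.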