Keywords: @Note2; Biomedical text mining; Information retrieval task; PDF to text conversion; Patents