Named entity recognition and text compression
VSB-TECHNICAL UNIVERSITY OF OSTRAVA
FACULTY OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
DEPARTMENT OF COMPUTER SCIENCE

PHD THESIS
Study branch: Computer Science

Named Entity Recognition and Text Compression

Author: Vu Nguyen Hong
Supervisor: Prof. RNDr. Václav Snášel

Acknowledgements

This thesis is the result of research carried out during my PhD program at VSB-Technical University of Ostrava, Czech Republic. It is my pleasure to thank all those who have helped me.

First, I would like to express my deep appreciation to my academic and thesis supervisor, Professor Václav Snášel, who has been vigorously supervising my studies, supporting my research, and has been constantly involved in guiding me towards my goal. This thesis would [...]r the consecutive trips to Vietnam to guide me so that I was able to achieve my goal of completing my research and this thesis project. I will never forget everything he has done for me since I arrived in the Czech Republic.

I am really grateful to Dr. Hien Nguyen Thanh, my second thesis supervisor, for his guidance, feedback, and comments during my research. He has given me advice and guided me on how to approach and explore new challenges and how to divide an overwhelming task into smaller, more manageable tasks that are more readily accomplished. This allowed me to take my first steps in th[...]hinks will be valuable to me. I know that he spent a lot of his time with me and I wish to express my gratitude to him.

I am thankful to Dr. Phan Dao, Director of the European Cooperation Center of Ton Duc Thang University, Ho Chi Minh City, Vietnam, for giving me the opportunity to take part in the Sandwich Program. He has advised me on what to do and how I can achieve my goals during my research. I will never forget everything he did for me the first time I went to the Czech Republic: he and his family were so kind to help me arrange my accommodations, develop my itinerary, and choose some [...] thank all of my colleagues, my friends, and my classmates in the Sandwich Program.

Finally, I wish to express my heartfelt gratitude to my family for their love, encouragement, and support, especially my beloved Phuong Pham.

Abstract

In recent years, social networks have become very popular. It is easy for users to share their data using online social networks. Since data on social networks is idiomatic, irregular, brief, and includes acronyms and spelling errors, dealing with such data is more challenging than dealing with news or formal texts. With the huge volume of posts each day, effec[...] normalize Vietnamese informal text in social networks. This method has the ability to identify and normalize informal text based on the structure of Vietnamese words, Vietnamese syllabic rules, and a trigram model. After normalization, the data will be processed by a named entity recognition (NER) model to identify and classify the named entities in these data. In our NER model, we use six different types of features to recognize named entities categorized in three predefined classes: Person (PER), Location (LOC), and Organization (ORG).

When viewing social network data, we found that the size o[...]d, we use a trigram dictionary that is quite big, therefore we also need to decrease its size. To deal with this challenge, in this thesis, we propose three methods to compress text files, especially Vietnamese text. The first method is a syllable-based method relying on the structure of Vietnamese morphosyllables, consonants, syllables and vowels. The second method is trigram-based Vietnamese text compression based on a trigram dictionary. The last method is based on an n-gram slide window, in which we use five dictionaries for unigrams, bigrams, trigrams, four-grams and five-grams. This [...] recognition, text compression.

Contents

1 Introduction
1.1 Motivation
1.2 Thesis objective and scope
1.3 Thesis organization
2 Background and related work
2.1 Vietnamese language processing resources
2.1.1 Structure of Vietnamese word
2.1.2 Typing methods
2.1.3 Standard morphosyllables dictionary
2.2 Text compression
[...]