Text Processing
Tags: All, Information Retrieval
Author: Danilo Toapanta
Published: February 8, 2024
Slide 1: Text preprocessing
Slide 2: Outline
Slide 3: Outline
Slide 4: Outline
Slide 5: Zipf’s law
Slide 6: High-frequency words
Slide 7: Low-frequency words
Slide 8: Zipf’s law vs. real data
Slide 9: Outline
Slide 10: Heaps’ law
Slide 11: Outline
Slide 12: Text preprocessing pipeline
Slide 13: Example
Slide 14: Stop-word removal
Slide 15: Outline
Slide 16: Stemming
Slide 17: Algorithmic stemming (Porter stemmer)
Slide 18: Algorithmic stemming (Porter stemmer)
Slide 19: Dictionary-based stemming
Slide 20: Hybrid stemming (Krovetz stemmer)
Slide 21: Stemming example
Slide 22: Outline
Slide 23: Example
Slide 24: Dealing with phrases
Slide 25: Summary
Slide 26: Additional References
Slide 27