title: Counting Word Frequencies in a Text String in Go and Printing the Words That Occur More Than Once
date: 2022-05-22 15:17:00
toc: false
index_img: https://b3logfile.com/file/2022/05/Snipaste_2022-05-22_15-42-36-a560234d.png
category:
- Go
tags:
- Golang
- Output
- Statistics
- Text
- String
- Different
- Words
- Occurrence
- Frequency
- Times
- Word Frequency
Code#
package main

import (
	"fmt"
	"sort"
	"strings"
)

var s = "Fluff lettuce thoroughly, then mix with tea and serve smoothly in bottle.Enamel half a kilo of strudel in twenty and a half teaspoons of pork butt sauce."

// symbolCollection lists the punctuation stripped from both ends of each word.
var symbolCollection = []string{
	".",
	",",
}

func main() {
	result := make(map[string]int)
	fields := strings.Fields(s)
	for _, v := range fields {
		// Normalize the word: lower-case it, then trim surrounding punctuation.
		newV := strings.ToLower(v)
		for _, sym := range symbolCollection {
			newV = strings.Trim(newV, sym)
		}
		// Count the normalized word, not the raw field.
		val, ok := result[newV]
		if ok {
			val++
			result[newV] = val
		} else {
			result[newV] = 1
		}
	}
	// Map iteration order is unspecified, so collect and sort the keys first.
	unique := make([]string, 0, len(result))
	for t := range result {
		unique = append(unique, t)
	}
	sort.Strings(unique)
	// Print only the words that occur more than once.
	for _, k := range unique {
		v := result[k]
		if v > 1 {
			fmt.Println(k, v)
		}
	}
}
Result#
a 2
and 2
half 2
in 2
of 2
Summary#
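- Reviewed how to use the Trim, ToLower, and Fields functions in the strings package.
- Consolidated the usage of maps.
- Consolidated how to generate ordered output based on map keys.
- Consolidated how to iterate over a map using range.
- Maps and slices behave like references: Go passes them by value, but the copied value still refers to the same underlying data, so a function can modify the caller's map entries or slice elements.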
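The last point can be checked with a small experiment. The following is a minimal sketch, not part of the original exercise; the helper names addWord, setFirst, and appendWord are made up for illustration. It shows that writes through a map or to existing slice elements are visible to the caller, while an append inside the callee only changes the callee's own copy of the slice header.

```go
package main

import "fmt"

// addWord increments a counter in the caller's map. The map value is copied
// on the call, but the copy refers to the same underlying hash table.
func addWord(counts map[string]int, word string) {
	counts[word]++
}

// setFirst overwrites element 0, which lives in the backing array shared
// with the caller.
func setFirst(words []string, w string) {
	if len(words) > 0 {
		words[0] = w
	}
}

// appendWord appends to its local copy of the slice header only; the
// caller's slice keeps its original length.
func appendWord(words []string, w string) {
	words = append(words, w)
}

func main() {
	counts := map[string]int{}
	addWord(counts, "half")
	addWord(counts, "half")
	fmt.Println(counts["half"]) // 2

	words := []string{"a", "of"}
	setFirst(words, "and")
	appendWord(words, "in")
	fmt.Println(words, len(words)) // [and of] 2
}
```

When a function needs to grow a slice, returning the new slice (as append itself does) is the usual approach.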
Conclusion#
- Maps are versatile collections for unstructured data.
- Composite literals are a very convenient way to initialize maps.
- The range keyword can be used to iterate over maps.
- Maps still share the same underlying data even when assigned or passed to functions.
- Combining collections can further enhance their power.
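As a brief illustration of the points about composite literals and range, here is a sketch that initializes a map in one expression and prints it in key order. The helper sortedKeys and the sample counts are assumptions made up for this example; they combine a map with a slice, since a map on its own has no defined iteration order.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order. Map iteration order
// in Go is unspecified, so the keys are collected and sorted explicitly.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Composite literal: the map is declared and populated in one expression.
	counts := map[string]int{
		"of":   2,
		"a":    2,
		"kilo": 1,
	}
	// Prints "a 2", "kilo 1", "of 2", one pair per line.
	for _, k := range sortedKeys(counts) {
		fmt.Println(k, counts[k])
	}
}
```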
Supplement#
Although Go does not provide a built-in set type, a map can be used to improvise one. When a map is used as a set, the values usually do not matter, but they are typically set to true so that checking membership is convenient.
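A minimal sketch of that idea, assuming a hypothetical helper named uniqueWords: each word becomes a key with the value true, so looking up a word that is absent simply yields false, the zero value for bool.

```go
package main

import "fmt"

// uniqueWords builds a set of the distinct words in the input, represented
// as a map whose values are always true.
func uniqueWords(words []string) map[string]bool {
	set := make(map[string]bool, len(words))
	for _, w := range words {
		set[w] = true
	}
	return set
}

func main() {
	set := uniqueWords([]string{"half", "a", "kilo", "a", "half"})
	fmt.Println(len(set))    // 3
	fmt.Println(set["kilo"]) // true
	fmt.Println(set["pork"]) // false: a missing key yields the zero value
}
```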
Original Answer#
package main

import (
	"fmt"
	"sort"
	"strings"
)

// countWords returns how many times each word occurs in text, ignoring case
// and any leading or trailing periods and commas.
func countWords(text string) map[string]int {
	words := strings.Fields(strings.ToLower(text))
	frequency := make(map[string]int, len(words))
	for _, word := range words {
		newWord := strings.Trim(word, `.,`)
		frequency[newWord]++
	}
	return frequency
}

func main() {
	var text = "Fluff lettuce thoroughly, then mix with tea and serve smoothly in bottle.Enamel half a kilo of strudel in twenty and a half teaspoons of pork butt sauce."
	frequency := countWords(text)
	// Sort the keys so the output order is deterministic.
	unique := make([]string, 0, len(frequency))
	for k := range frequency {
		unique = append(unique, k)
	}
	sort.Strings(unique)
	// Print only the words that occur more than once.
	for _, word := range unique {
		count := frequency[word]
		if count > 1 {
			fmt.Printf("%s: %d\n", word, count)
		}
	}
}
- The code should be simplified by composing functions, rather than blindly chasing a working result while ignoring elegance and performance.
- My logic was not well thought out, and I was not familiar enough with the standard library functions. The words should come from strings.Fields(strings.ToLower(text)), and Trim accepts a cutset containing both the period and the comma, so there is no need to loop over a slice of symbols and trim each one in turn.
- A map value can be updated in place, as in frequency[newWord]++, without reading it into a temporary variable, incrementing it, and writing it back.
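One detail both versions share is that Trim only strips characters from the ends of a word, so "bottle.Enamel" is counted as the single word "bottle.enamel". The following sketch is an assumption of mine, not part of the reference answer: a hypothetical variant named countWordsSplit that uses strings.FieldsFunc with unicode.IsLetter to treat every non-letter as a separator.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// countWordsSplit is a hypothetical variant of countWords that splits the
// text on every rune that is not a letter, so punctuation inside a token
// also separates words.
func countWordsSplit(text string) map[string]int {
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	frequency := make(map[string]int, len(words))
	for _, word := range words {
		frequency[word]++
	}
	return frequency
}

func main() {
	freq := countWordsSplit("Serve in bottle.Enamel half a kilo.")
	fmt.Println(freq["bottle"], freq["enamel"], freq["bottle.enamel"]) // 1 1 0
}
```

The trade-off is that apostrophes, hyphens, and digits also act as separators, so whether this is an improvement depends on the text being counted.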