In software puzzles, grouping words that contain the exact same letters in different orders (known as anagrams) is a classic problem. By sorting the characters of each word alphabetically, you can generate a unique signature or "fingerprint" for that letter combination, allowing you to group matching anagrams easily using a map structure.
In this guide, we will explain this algorithm using a fridge magnet sorting analogy and walk through its trace in Java.
Imagine you have a pile of scrambled word cards on the fridge, like "tlebee", "beeetl", and "beetle". You want to group all the word cards that share the exact same letters together.
To do this systematically, you establish a sorting rule:
- Sort the letters alphabetically: Pick up the letters of
"tlebee"and line them up:[b, e, e, e, l, t]. This sorted key is the word's fingerprint. - Put them in labeled boxes: Place the original card
"tlebee"in a box labeled "beeelt". - Group: Repeat the process for
"beeetl". Its letters also sort to[b, e, e, e, l, t], so it goes in the same box.
By the end, the box labeled "beeelt" contains all matching anagrams: "tlebee", "beeetl", and "beetle"!
Walkthrough of the Main Method Scenario
Let's trace how the program executes step-by-step from the entry point of the main method:
The program defines a list of words containing various anagram families:
List<String> list = List.of("apple", "pineapple", "appleinep", "beetle", "tlebee", "beeetl", "lapep", "atyeti");
For example, "apple" and "lapep" belong to the same anagram family, while "beetle", "tlebee", and "beeetl" belong to another.
The program iterates through the list, sorting each word and incrementing its count in a Map:
list.forEach(eachWord -> {
char[] arr = eachWord.toCharArray();
Arrays.sort(arr);
String finalValue = String.valueOf(arr);
if (!map.containsKey(finalValue)) {
map.put(finalValue, 1);
} else {
int count = map.get(finalValue);
map.put(finalValue, ++count);
}
});
For "apple", the sorted characters are "aelpp". It is added to the map with count 1. When "lapep" is processed, it also sorts to "aelpp", and its count is incremented to 2.
Next, the program uses the Java Stream API to group, sort, and retrieve the largest anagram family:
Map.Entry<String, Long> stringLongEntry = list.stream()
.map(eachWord -> {
char[] arr = eachWord.toCharArray();
Arrays.sort(arr);
return String.valueOf(arr);
})
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.sorted((o1, o2) -> o2.getValue().compareTo(o1.getValue()))
.findFirst()
.get();
- Map: Converts all words to their sorted forms (e.g. "apple" -> "aelpp").
- groupingBy: Groups identical sorted strings and counts them:
{beeelt=3, aelpp=2, ...}. - Sort: Sorts the map entries by count descending.
- findFirst: Retrieves the entry with the highest frequency, which is
beeelt=3.
Java Implementation
Below is the complete Java code demonstrating character-sorting and stream-based grouping for anagram frequency collection:
package io.practise.accolite;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
public class StringGrouping {
public static void main(String[] args) {
List<String> list = List.of("apple", "pineapple", "appleinep", "beetle", "tlebee", "beeetl", "lapep", "atyeti");
Map<String, Integer> map = new HashMap<>();
list.forEach(eachWord -> {
char[] arr = eachWord.toCharArray();
Arrays.sort(arr);
String finalValue = String.valueOf(arr);
if (!map.containsKey(finalValue)) {
map.put(finalValue, 1);
} else {
int count = map.get(finalValue);
map.put(finalValue, ++count);
}
});
Map.Entry<String, Long> stringLongEntry = list.stream()
.map(eachWord -> {
char[] arr = eachWord.toCharArray();
Arrays.sort(arr);
return String.valueOf(arr);
})
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.sorted((o1, o2) -> o2.getValue().compareTo(o1.getValue()))
.findFirst()
.get();
System.out.println(stringLongEntry);
}
}
Conclusion
Grouping anagrams is highly efficient when you convert scrambled strings to a common sorted representation. This sorted signature acts as the primary key in your grouping logic. Using Java Streams makes the process of mapping, grouping, and sorting collections of data expressive and clean.