Function Optimization: ms to µs11 Mar 2024
While scouting Zed's repository, I came across an issue that caught my eye. It was calling out to improve the matches in the command pallete when comes to spaces. As shown, when the input query contains more spaces than intended, the matching accuracy deteriorates. The issue was open for a while and no one has taken it up.
My Genius Idea
With this foolproof plan in mind, I started to work on it. I opened the command_palette.rs file and found the function which is responsible for matching the user's query with list of commands.
That function goes something like this.
fn update_matches(
&mut self,
query: String,
cx: &mut ViewContext<Picker<Self>>,
) -> gpui::Task<()> {
let (mut tx, mut rx) = postage::dispatch::channel(1);
let task = cx.background_executor().spawn({
let mut commands = self.all_commands.clone(); // List of all the commands
let query = query.clone(); // copy of the users query
// the fuzzy matching logic using the query
// ...
// ...
});
}
By examining the function, it's clear that the input query is used as-is to match with the commands.
Here Comes Regex.
It was a simple task or so I thought. I wrote a simple regex.
// import Regex
use regex::Regex;
// set lazy static variable for regex
lazy_static! {
static ref STRIP_UNNECESSARY_WHITESPACES: Regex = Regex::new(r"s+").unwrap();
}
fn update_matches(
&mut self,
query: String,
cx: &mut ViewContext<Picker<Self>>,
) -> gpui::Task<()> {
let (mut tx, mut rx) = postage::dispatch::channel(1);
let task = cx.background_executor().spawn({
let mut commands = self.all_commands.clone();
let query = STRIP_UNNECESSARY_WHITESPACES.replace_all(&query.trim(), " ").to_string();
// the fuzzy matching logic
// ...
// ...
});
};
Yep. This is it. The solution essentially removes all excess whitespace. Since there's already a logic for fuzzy match, I don't need to consider any whitespaces, eiter at beginning or in the middle or at the end of the word. Now all that's left is, Raise a PR and get LGTM comment and be done with it. Right? Right???
User TerminalFi enters the chat.
He commented, "From a performance perspective and reducing imports. This manual method I believe is faster" and gave the below code (I added comments to make it more understandable).
fn trim_consecutive_whitespaces(input: &str) -> String {
// Pre-allocate the result string with the same capacity as the input string
let mut result = String::with_capacity(input.len());
// Keep track of whether the last character was whitespace
let mut last_char_was_whitespace = false;
// Iterate over the characters in the input string
for char in input.trim().chars() {
// If the character is whitespace
if char.is_whitespace() {
// If the last character was not whitespace
if !last_char_was_whitespace {
// Add the character to the result string
result.push(char);
}
// Update the last_char_was_whitespace flag
last_char_was_whitespace = true;
} else {
// Add the character to the result string
result.push(char);
// Update the last_char_was_whitespace flag
last_char_was_whitespace = false;
}
}
// Return the result string
result
}
My obvious reaction was "HOW..??"
I quickly ran a benchmark test. Sounds fancy right? All I did was start a timer when the function is called, then do the regex match for 10 iterations using for loop and stop the timer once it's done.
// Function using Regex
fn regex_method(input: &str) -> String {
CONSECUTIVE_WHITESPACES
.replace_all(input.trim(), " ")
.to_string()
}
let str = " Wei rd ly sp a ced sen ten se";
// start the timer
let start = Instant::now();
// run the regex match 10 times
for _ in 0..10 {
let _ = regex_method(str);
}
// stop the timer
let duration = start.elapsed();
// print the result
println!("{:?}", duration);
The result was 2.59644ms. This is blazing fast. how come manual method is quicker? By trying TerminalFi approach, not only I found his approach was optimal, but shocked to see the result. Manual approach for 10 iterations took merely 3.029µs
TerminalFi also gave the benchmark results for 100,000 iterations and the result were quite amazing.
Regex method took: 979.481709ms
Manual method took: 162.163583ms
That's the moment, I realised, I wrote ✨shit code✨.
Why?
To understand why the manual method is faster than Regex, we need to understand how Regex and manual method works.
-
Complexity: Regex involves a complex pattern matching engine for flexibility, while manual methods are straightforward and task-specific.
-
Overhead: Regex engines carry the overhead of unused features for simple tasks, unlike the direct approach of manual methods.
-
Optimization: Manual methods can be finely tuned for specific tasks, potentially reducing memory and CPU usage compared to the more general regex.
-
CPU Cache Utilization: Manual methods can make better use of the CPU cache through linear access patterns, unlike the less predictable access patterns of regex operations.
-
Compiling Regex: Regex requires compilation, adding initial overhead, whereas manual methods don't have this step.
Replaced my Regex solution with TerminalFi's manual method. I pushed my chages. The pull request was approved and merged into main.
Yay.🥂
< Back