When you have large number of patterns (dozens) to scan to find out which pattern is matching a given string, there's a few things you can do to speed up the job.

If the patterns are hard coded, there is of course any number of ways that you can be clever. But if you do not know what the patterns look like beforehand, which is the case when you're trying to match input strings against patterns in [DEPRECATED GlobalStrings.lua] using a formatstring-to-regex utility like BabbleLib's Deformat() function.

The approach below works by making lists of words used by patterns, and then looking at words in the input strings to determine which list(s) to look for matches in.

Actually, the process is 2-pass. The first pass figures out the LEAST commonly used words, and then just uses those.

  • Note: The example contains a very simplistic "MyDeformatterFunc()" for converting "%s" to "(.*)". It will not work for other locales than english. Do not use it in the real world, please.

-- Functions that we want called for different string matches
function RoughPokeFunc(v1,v2) print("RoughPokeFunc "..v1.." "..v2); end
function SoftPokeFunc(v1,v2) print("SoftPokeFunc "..v1.." "..v2); end
function SoftNudgeFunc(v1,v2) print("SoftNudgeFunc "..v1.." "..v2); end
function ChickenFunc(v1,v2) print("ChickenFunc "..v1.." "..v2); end

-- Strings to match mapped to functions that we want called
MatchStrings = {
  ["%s roughly pokes %s"] = RoughPokeFunc,
  ["%s softly pokes %s"] = SoftPokeFunc,
  ["%s softly nudges %s"] = SoftNudgeFunc,
  ["%s gets nudged by %s and runs away screaming"] = ChickenFunc,

-- VERY simplistic deformatter function. 
-- You probably want a real deformatting library for this.
function MyDeformatterFunc(str)	
  return (string.gsub(str, "%%s", "(.*)"));

-- First run: count how many occurences there are of each word
WordCounts = {}
for str,func in MatchStrings do
  for word in string.gfind(str, "[^ ]+") do
  	if(string.find(word, "^%%")) then
  		-- ignore format strings
  		WordCounts[word] = (WordCounts[word] or 0) + 1;

-- Second run: for each string, pick the least common word and place string in that hash bucket
MatchStringsHash = {}
for str,func in MatchStrings do
  local bestword, num;
  for word in string.gfind(str, "[^ ]+") do
  	if(string.find(word, "^%%")) then
  		-- ignore format strings
  		if(not num or WordCounts[word] < num) then
  			num = WordCounts[word];
  			bestword = word;
  if(not MatchStringsHash[bestword]) then MatchStringsHash[bestword] = {}; end
  MatchStringsHash[bestword][MyDeformatterFunc(str)] = func;

WordCounts = nil; -- now we don't need the counts anymore

-- Dump our MatchStringsHash on-screen so we can see what it looks like!
print "Examining hash buckets"
print "----------------------"
for word,strings in MatchStringsHash do
  print("  "..word..":");
  for str,func in strings do
  	print("    \""..str.."\"");

-- Function that scans for matches and calls the resulting function
function ScanForMatch(str)
  local bDone = false;
  local nCompares = 0;
  for word in string.gfind(str, "[^ ]+") do
  	if(MatchStringsHash[word]) then
  		for pattern,func in MatchStringsHash[word] do
  			nCompares = nCompares + 1;
  			local success,_,v1,v2,v3,v4 = string.find(str, pattern);
  			if(success) then
  	if(bDone) then break; end

  print("  \""..str.."\": "..nCompares.." string.finds actually executed\n");


ScanForMatch("Alice roughly pokes Bob");
ScanForMatch("Bob softly pokes Charles");
ScanForMatch("Charles softly nudges Denise");
ScanForMatch("Denise gets nudged by Eve and runs away screaming");
ScanForMatch("This string does not exist");

Running the above produces the following output:

Examining hash buckets
    "(.*) roughly pokes (.*)"
    "(.*) softly nudges (.*)"
    "(.*) gets nudged by (.*) and runs away screaming"
    "(.*) softly pokes (.*)"

RoughPokeFunc Alice Bob
  "Alice roughly pokes Bob": 1 string.finds actually executed

SoftPokeFunc Bob Charles
  "Bob softly pokes Charles": 1 string.finds actually executed

SoftNudgeFunc Charles Denise
  "Charles softly nudges Denise": 2 string.finds actually executed

ChickenFunc Denise Eve
  "Denise gets nudged by Eve and runs away screaming": 1 string.finds actually executed

  "This string does not exist": 0 string.finds actually executed

Problems with this approach Edit

There is no guarantee as to which order the string matches will be attempted.

For example, assume these two patterns:

  1. "%s hits %s."
  2. "%s hits %s hard."

Now, given the input string "Alice hits Bob.", only #1 will match, and all is good.

But with the input string "Alice hits Bob hard.", there is NO guarantee which string will match. You can get #1 with the arguments "Alice", "Bob hard". Or you can get #2 with the arguments "Alice", "Bob".

Ad blocker interference detected!

Wikia is a free-to-use site that makes money from advertising. We have a modified experience for viewers using ad blockers

Wikia is not accessible if you’ve made further modifications. Remove the custom ad blocker rule(s) and the page will load as expected.