Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

javascript - Filter array of strings based on a pattern with placeholders

I have been struggling to do this (every DAY!) for at least a month. I have searched Stack Overflow, and I have read the MDN array, string, and regex references over and over again, and nothing has helped. I am somewhat familiar with regex, but this is over my head. I trust that somebody here will solve this with one line of code, which is why I waited until I was about to throw my computer out the window before asking for help. I really wanted to find the solution myself, but I simply cannot do it.

I was enjoying a game of cryptograms, where random letters are used to 'encode' a poem or story. I probably don't need to describe it here, but here's a picture just in case.

[Image: example cryptogram puzzle]

So I thought it would be a good exercise to create a form where you can enter a pattern made up of a combination of letters, digits, and "?" for unknown. In the image, one word is encoded as "YACAZ". There are two A's in that word, so you know those two letters are the same. In my function, you would use any digit 0-9 as a placeholder for such a repeated letter, so for the same example you would enter "?1a1?".

Here's what I have at the moment. Every time I try to iterate through the arrays that regex gives me, I end up at the same place, trying - and failing - to compare two sets of nested arrays with each other. No matter how I try to break them down and compare them, it ends up becoming a huge non-functioning mess. I can get the placeholder indexes, but then what?

I have nothing against lodash, but I have very little experience with it, so maybe it could help with this? It doesn't do anything that cannot be done with plain vanilla JavaScript, does it?

const words = [
  { word: 'bargain', score: 1700 },
  { word: 'balloon', score: 1613 },
  { word: 'bastion', score: 1299 },
  { word: 'babylon', score: 634 },
  { word: 'based on', score: 425 },
  { word: 'bassoon', score: 371 },
  { word: 'baldwin', score: 359 },
  { word: 'bahrain', score: 318 },
  { word: 'balmain', score: 249 },
  { word: 'basilan', score: 218 },
  { word: 'bang on', score: 209 },
  { word: 'baseman', score: 204 },
  { word: 'batsman', score: 204 },
  { word: 'bakunin', score: 143 },
  { word: 'barchan', score: 135 },
  { word: 'bastian', score: 133 },
  { word: 'balagan', score: 118 },
  { word: 'balafon', score: 113 },
  { word: 'bank on', score: 113 },
  { word: 'ballpen', score: 111 },
]

const input = 'ba1122n' // those are numeric 1's, not lowercase L's

//matching words from the list above should be 'balloon' and 'bassoon', using the input 'ba1122n'.

export const stringDiff = (a, b) => {
  let match = false,
    error = ''
  const results = []

  // Idk why I have a reducer inside a loop. I have tried many, many, MANY other
  // ways of looping, usually `for (const i in whatever)` but they all end up with
  // the same problem. I usually have a loop inside a reducer, not the other way around.
  
  const forLoop = (array) => {
   
    a.reduce((acc, curr, next) => {
      const aa = [...curr.input.matchAll(curr[0])] // this tells me how many 0's, 1's, etc.

      const bChar = b.charAt(curr.index) // this tells me what letters are at those index positions
      const bb = [...b.matchAll(bChar)] // if the array 'bb' is not the same length, it's not a match
      if (aa.length === bb.length) {
        /* console output:
        word bargain

        aa:
        0: ["2", index: 4, input: "ba1122n", groups: undefined]
        1: ["2", index: 5, input: "ba1122n", groups: undefined]

        bb:
        0: ["a", index: 1, input: "bargain", groups: undefined]
        1: ["a", index: 4, input: "bargain", groups: undefined]
        */
       
        // matching the lengths only helps narrow down ***some*** of the non-matching words.
        // How do I match each index of each letter in each word with
        // each index of each placeholder character??? And check the letters match ***EACH OTHER***????
        // with any number of placeholders for any digit 0 - 9?
      }
    }, [])

    return array
  }

  console.log('forLoop', forLoop([]))

  return { match, results, error }
}

stringDiff(words,input)

1 Answer

Following up on my comment above, I'm still not quite sure whether the approach below meets the OP's goal.

But if the goal is to create a regex from a custom substitution pattern and then filter a word list by that regex (and maybe even capture the matched characters), one might give the following code a try.

There is a limitation, though: the digits that describe the custom placeholder pattern are limited to 1 through 9 (zero is excluded), since this matches exactly how regex capture groups are numbered and how one refers back to them.
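To make that limitation concrete, here is a minimal sketch of how capture groups and back references behave in JavaScript. The sample regexes and variable names are only illustrative and not part of the code that follows. Groups are numbered from 1 by the position of their opening parenthesis, and `\1` must repeat exactly what group 1 captured; there is no group 0 to refer back to.

```javascript
// Group 1 captures one word character; \1 must repeat that same character.
const doubledLetter = /(\w)\1/;
console.log(doubledLetter.test('balloon')); // true ("ll")
console.log(doubledLetter.test('bargain')); // false (no doubled letter)

// Two groups: \1 refers to the first capture, \2 to the second.
const twoPairs = /^ba(\w)\1(\w)\2n$/;
const captured = 'bassoon'.match(twoPairs);
console.log(captured[1], captured[2]); // s o
```

Because the numbering follows group position, a placeholder digit only lines up with its back reference when the digits are introduced as 1, 2, 3, and so on.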

function createRegexFromSubstitutePattern(pattern) {
  // - turn e.g. `ba1122n` into `/ba(\w)\1(\w)\2n/`
  // - turn e.g. `?1a1?` into `/.(\w)a\1./`
  // - turn e.g. `?1b22a1?` into `/.(\w)b(\w)\2a\1./`
  // - note: back references are numbered by group position, so this
  //   assumes the digits are used as 1, 2, 3, ... in order of first
  //   appearance, without gaps.
  return RegExp(
    [1, 2, 3, 4, 5, 6, 7, 8, 9].reduce((regXString, placeholder) =>

      // programmatically replace the first occurrence of
      // any digit (from 1 to 9) with a capture group pattern
      // for a single word character.
      regXString.replace(RegExp(placeholder, ''), '(\\w)'),

      // provide the initial input/pattern as start value.
      String(pattern)
    )
    // replace any further occurrence of any digit (from 1 to 9)
    // by a back reference pattern which matches the group's index.
    .replace((/([1-9])/g), '\\$1')

    // replace the wildcard placeholder with the regex wildcard.
    .replace((/\?/g), '.'), '');
}

const wordList = [
  { word: 'bargain', score: 1700 },
  { word: 'balloon', score: 1613 },
  { word: 'bastion', score: 1299 },
  { word: 'babylon', score: 634 },
  { word: 'based on', score: 425 },
  { word: 'bassoon', score: 371 },
  { word: 'baldwin', score: 359 },
  { word: 'bahrain', score: 318 },
  { word: 'balmain', score: 249 },
  { word: 'basilan', score: 218 },
  { word: 'bang on', score: 209 },
  { word: 'baseman', score: 204 },
  { word: 'batsman', score: 204 },
  { word: 'bakunin', score: 143 },
  { word: 'barchan', score: 135 },
  { word: 'bastian', score: 133 },
  { word: 'balagan', score: 118 },
  { word: 'balafon', score: 113 },
  { word: 'bank on', score: 113 },
  { word: 'ballpen', score: 111 },
];
const input = 'ba1122n';

const regXWord = createRegexFromSubstitutePattern(input);

console.log(
  'filter word list ...',
  wordList
    .filter(item => regXWord.test(item.word))
);
console.log(
  "filter word list and map each word's match and captures ...",
  wordList
    .filter(item => regXWord.test(item.word))
    .map(item => item.word.match(regXWord))
);

console.log(
  "createRegexFromSubstitutePattern('ba1122n')",
  createRegexFromSubstitutePattern('ba1122n')
);
console.log(
  "createRegexFromSubstitutePattern('?1a1?')",
  createRegexFromSubstitutePattern('?1a1?')
);
console.log(
  "createRegexFromSubstitutePattern('?1b22a1?')",
  createRegexFromSubstitutePattern('?1b22a1?')
);
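
For comparison, the same filtering can be sketched without a regex by walking the pattern and the word index by index, which is what the comments in the question ask about. This is not the approach of the answer above: `matchesPattern` and `sampleWords` are hypothetical names, and treating `0` as an ordinary placeholder is an assumption. The sketch remembers the letter first seen for each digit, so digits may appear in any order. Unlike the regex version, it requires the word and the pattern to have the same length, and a digit may stand for any character, not only a word character.

```javascript
// Hypothetical helper, not part of the original answer.
// Returns true when `word` fits `pattern`, where:
//   '?'        matches any single character,
//   '0'..'9'   must stand for the same character everywhere that digit occurs,
//   any other character must match literally.
function matchesPattern(pattern, word) {
  if (pattern.length !== word.length) return false;

  const letterByDigit = new Map(); // placeholder digit -> letter first seen

  for (let i = 0; i < pattern.length; i++) {
    const p = pattern[i];
    const ch = word[i];

    if (p === '?') continue; // wildcard

    if (p >= '0' && p <= '9') {
      if (!letterByDigit.has(p)) {
        letterByDigit.set(p, ch); // first sighting: remember the letter
      } else if (letterByDigit.get(p) !== ch) {
        return false; // same digit, different letter
      }
      continue;
    }

    if (p !== ch) return false; // literal mismatch
  }
  return true;
}

const sampleWords = ['bargain', 'balloon', 'bastion', 'bassoon', 'ballpen'];
const matches = sampleWords.filter(word => matchesPattern('ba1122n', word));
console.log(matches); // [ 'balloon', 'bassoon' ]
```

Whether two different digits may stand for the same letter is not specified in the question; this sketch allows it, as the regex version does.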