The iron ML notebook
  • The iron data science notebook
  • ML & Data Science
    • Frequent Questions
      • Discriminative vs Generative models
      • Supervised vs Unsupervised learning
      • Batch vs Online Learning
      • Instance-based vs Model-based Learning
      • Bias-Variance Tradeoff
      • Probability vs Likelihood
      • Covariance vs Correlation Matrix
      • Precision vs Recall
      • How does a ROC curve work?
      • Ridge vs Lasso
      • Anomaly detection methods
      • How to deal with imbalanced datasets?
      • What is "Statistically Significant"?
      • Recommendation systems methods
    • Statistics
      • The basics
      • Distributions
      • Sampling
      • IQR
      • Z-score
      • F-statistic
      • Outliers
      • The bayesian basis
      • Statistic vs Parameter
      • Markov Monte Carlo Chain
    • ML Techniques
      • Pre-process
        • PCA
      • Loss functions
      • Regularization
      • Optimization
      • Metrics
        • Distance measures
      • Activation Functions
      • Selection functions
      • Feature Normalization
      • Cross-validation
      • Hyperparameter tuning
      • Ensemble methods
      • Hard negative mining
      • ML Serving
        • Quantization
        • Kernel Auto-Tuning
        • NVIDIA TensorRT vs ONNX Runtime
    • Machine Learning Algorithms
      • Supervised Learning
        • Support Vector Machines
        • Adaptative boosting
        • Gradient boosting
        • Regression algorithms
          • Linear Regression
          • Lasso regression
          • Multi Layer Perceptron
        • Classification algorithms
          • Perceptron
          • Logistic Regression
          • Multilayer Perceptron
          • kNN
          • Naive Bayes
          • Decision Trees
          • Random Forest
          • Gradient Boosted Trees
      • Unsupervised learning
        • Clustering
          • Clustering metrics
          • kMeans
          • Gaussian Mixture Model
          • Hierarchical clustering
          • DBSCAN
      • Cameras
        • Intrinsic and extrinsic parameters
    • Computer Vision
      • Object Detection
        • Two-Stage detectors
          • Traditional Detection Models
          • R-CNN
          • Fast R-CNN
          • Faster R-CNN
        • One-Stage detectors
          • YOLO
          • YOLO v2
          • YOLO v3
          • YOLOX
        • Techniques
          • NMS
          • ROI Pooling
        • Metrics
          • Objectness Score
          • Coco Metrics
          • IoU
      • MOT
        • SORT
        • Deep SORT
  • Related Topics
    • Intro
    • Python
      • Global Interpreter Lock (GIL)
      • Mutability
      • AsyncIO
    • SQL
    • Combinatorics
    • Data Engineering Questions
    • Distributed computation
      • About threads & processes
      • REST vs gRPC
  • Algorithms & data structures
    • Array
      • Online Stock Span
      • Two Sum
      • Best time to by and sell stock
      • Rank word combination
      • Largest subarray with zero sum
    • Binary
      • Sum of Two Integers
    • Tree
      • Maximum Depth of Binary Tree
      • Same Tree
      • Invert/Flip Binary Tree
      • Binary Tree Paths
      • Binary Tree Maximum Path Sum
    • Matrix
      • Set Matrix Zeroes
    • Linked List
      • Reverse Linked List
      • Detect Cycle
      • Merge Two Sorted Lists
      • Merge k Sorted Lists
    • String
      • Longest Substring Without Repeating Characters
      • Longest Repeating Character Replacement
      • Minimum Window Substring
    • Interval
    • Graph
    • Heap
    • Dynamic Programming
      • Fibonacci
      • Grid Traveler
      • Can Sum
      • How Sum
      • Best Sum
      • Can Construct
      • Count Construct
      • All Construct
      • Climbing Stairs
Powered by GitBook
On this page

Was this helpful?

  1. Algorithms & data structures
  2. Array

Rank word combination

"""
Consider a "word" as any sequence of capital letters A-Z (not limited to just "dictionary words").
For any word with at least two different letters, there are other words composed of the same letters but in a different order
(for instance, STATIONARILY/ANTIROYALIST/AAIILNORSTTY).

We can then assign a number to every word, based on where it falls in an alphabetically sorted list of all words made up of the same group of letters.

Given a word, return its number.
Your function should be able to accept any word 25 letters or less in length (possibly with some letters repeated).

Sample words, with their rank:
ABAB = 2
AAAB = 1
BAAA = 4

"""

from itertools import permutations


def list_position(word: str):
    """
    time complexity: O(!n)
    """
    word_perms = set(["".join(perm) for perm in permutations(list(word))])

    word_perms = list(sorted(word_perms))

    return word_perms.index(word) + 1

# 1. Findout which unique characters we have (sorted)
# 2. Iterate our string
# 3. In each iteration, fix current char and compute perm
# return result

# BCA  

(3 - 0 - 1)! = 2 -> 2 * (1 ) = 2

(2 - 1)

def list_position(word: str):

    sorted_word = sorted(word)

    hashmap = {c: i for i, c in enumerate(sorted_word)}

    position = 0
    for i, char in enumerate(word):
        index_in_sorted = hashmap[char]

        position += math.factorial((len(word) - i - 1)) * index_in_sorted
        hashmap.pop(char)

        # position += math.factorial((len(word) - i - 1)) * index_in_sorted
        # hashmap.pop(char)

    return position

def list_position(word):
    """
    Given a word
    1. Get total num of permutations
    2. take the first letter
        2i. How many permutations have happened before this?
    3. Take the second letter.
        i. How many permutations before this?
    4. ...
    """
    count = 0
    while len(word):
        # get unique letters and num permutations
        char_set = set(word)

        # get number of permutations and deal with duplicates
        num_permutations = factorial(len(word))
        for ch in char_set:
            num_permutations /= factorial(word.count(ch))

        # start with the first letter
        # iterate through the set of unique chars
        # if the first letter comes after the char, we add on the num permutation that have passed
        first_ch = word[0]
        for ch in char_set:
            if ch < first_ch:
                count += num_permutations / len(word) * word.count(ch)

        word = word[1:]

    return count + 1


# testing code
words_and_solutions = {
    "ABC": 1,  # 
    "BCA": 4,  # ABC, ACB, BAC all before
    "CBA": 6,  # ABC, ACB, BAC, BCA, CAB
    "AAAB": 1,
    "ABAB": 2,  # AABB, ABAB, ABBA, BAAB, BABA, BBAA
    "BAAA": 4,
    "QUESTION": 24572,
    "BOOKKEEPER": 10743
    }
for word, solution in words_and_solutions.items():
    output = list_position(word=word)
    if output != solution:
        print(f"ERROR: {word} ({output}!={solution})")
    else:
        print(f"Correct: {word} ({output}=={solution})")
PreviousBest time to by and sell stockNextLargest subarray with zero sum

Last updated 2 years ago

Was this helpful?