Little python exercise

All

Intro

Recently I did an interview with the following problem:

You have a list of strings, on our list, some of the strings were corrupted on the reading process. You want to avoid duplications so do an algorithm to solve this.

List example: ttoo, toto, oott.

Example output: toto.

List example 2:  xt, tx.

Example output 2: xt.

Solutions

There are several ways to solve the problem and basically, you can understand it as a sort algorithm, a kind of special sort, but a sort. Therefore a bubble sort or insertion sort can be applied.

Solution 1

My second solution was to come back to a dict and try to use it to optimize the size comparison. However, since I’m using a dict (i.e. an encoding operation) I will need therefore an operation for translate it in a list (i.e. decoding operation).

Part A – encoding:

for each_word in words:
    key = len(each_word)
    if key not in result:
        result[key] = [each_word]
    else:
        list_values = result[key]
        list_values.append(each_word)
        result[key] = list_values

Part A2: I use an aux function to end correctly in the dict

def create_new_list(list_values):
    #['ot', 'to', 'tt']
    #['to', 'tt']
    sorted(list_values)

    list_new = [list_values[0]]
    flag = True
    for i in range(1, len(list_values)):
        flag = True
        for each in list_new:
            if set(list_values[i]) == set(each):
                flag = False
        if flag:
            list_new.append(list_values[i])
    return list_new

Solution 2

This was actually my first solution on the head, but second to be implemented. And its more or less based on the insertion sort algorithm and basically inserts an element on the final list if and only if the set of this element not there.  [I mean based on insertion because you will get the elements on the list and compare with the final list.] + the survival flag for each element.

However, you still need to do the size comparison to avoid tt in ttoo problems:

flag = True
for i in range(1, len(words)):
    flag = True
    for each in final:
        if len(words[i]) == len(each) and set(words[i]) == set(each):
            flag = False
    if flag:
        final.append(words[i])
return final

The solution can be improved by using the sorted using len option (i) + the comparison is first done on the length and then on the set, avoiding unnecessary comparisons.

(i) sorted(xs, key=len)

Solution 3

Solution 3 would use a dict comprehension on the first solution. But after thinking about it, it just too noisy and unclear for the reader.

(i) I’m creating the values of the Dict on the fly, so I don’t need an extra function to do it for me (encoding).

(ii) I just put a list comprehension in the end (for the decoder operation).

(i)for each_word in words:
    key = len(each_word)
    if key not in result:
        result[key] = [each_word]
    else:
        list_values = result[key]
        # 2:[cra] - car
        for each in list_values:
            if set(each) != set(each_word):
                list_values.append(each_word)
                result[key] = list_values
(ii) return [result.get(each_key)[0] for each_key in result.keys()]

 

Solution 4

It’s totally possible to do it in just one list comprehension, though it would be very long though, basically because one would need to condense the survival code within the list comprehension. I don’t think it worth though.

Source

All the solutions are there.

https://github.com/FranciscoMeloJr/Python-Tests/tree/master/banks

REFs

[1] https://www.datacamp.com/community/tutorials/python-dictionary-comprehension

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s