Day - 1 of the series - The challenge is a variation on 3-sum problem that goes like this:
You have an array. Find 3 values such that their sum equals a target value
Click here for part-1
Having solved the Part-1 (Attempt-1) earlier I quickly jumped to extend it with an array of arrays as below
""" We are going to build an upper triangular matrix where
- for each new number seen
-- We first check if any of the existing entries in non-first row add up to give required number
-- If that fails, then we append a new list of size "one" higher than current
- Every row (or list) holds sum of its first entry with all other firsts in order from top to bottom
-
- a b c d e f g
- a+b a+c a+d a+e a+f a+g
- b+c b+d b+e b+f b+g
- c+d c+e c+f c+g
- d+e d+f d+g
- e+f e+g
- f+g
"""
def execute():
counter = -1
for line in lines:
"""
for i in arr:
print(i)
print("") """
val = int(line)
counter += 1
if counter == 0:
arr.append([])
arr[0].append([])
arr[0][0] = val
continue
if counter == 1:
arr.append([])
arr[1].append([])
arr[1][0] = val
arr[1].append(arr[0][0] + arr[1][0])
continue
arr_enum_iterable = enumerate(iter(arr))
next(arr_enum_iterable)
for row_idx, row in arr_enum_iterable:
row_enum_iterable = enumerate(iter(row))
next(row_enum_iterable)
for sum_idx, sum in row_enum_iterable:
if val + sum == 2020:
print("Result: " + str(arr[row_idx][0] * arr[sum_idx - 1][0] * val))
return
arr.append([])
arr[counter].append([])
arr[counter][0] = val
for item in range(counter):
arr[counter].append(arr[item][0] + val)
lines = open('input-1.txt', 'r')
arr = []
execute()
However, by the time of writing this blog post, I already assimilated into me that position-less maps/sets are to be preferred to an array. So I start thinking of maintaining 2 datastructures instead.
Here’s the time taken in 3 consecutive runs:
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$ time python3 part-2-attempt-1.py
Result: 85555470
real 0m0.089s
user 0m0.081s
sys 0m0.008s
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$ time python3 part-2-attempt-1.py
Result: 85555470
real 0m0.091s
user 0m0.087s
sys 0m0.004s
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$ time python3 part-2-attempt-1.py
Result: 85555470
real 0m0.093s
user 0m0.089s
sys 0m0.004s
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$
We can’t do with sets of course and need hashMap since after finding the target sum (2020 - value), we need to locate which 2 items led to that sum originally
""" We are going to leverage 2 separate datastructures:
(1) an array so we can retrieve numbers read via their sum (stored in the next datastructure)
(2) a hashmap where
- 'key' is a sum of 2 values and
- 'value' is an ordered pair (a tuple) representing indices of the two values (from previous datastructure) that led to this sum
- Now for the process,
-- for each new value seen
-- We first check if (2020 - value) exists in 'hashmap' as 'key'
--- If 'yes', we retrieve its value for positions
---- Thereafter, retrieval of the 2 items from 'array' for multiplication are simply constant time operations. Then, we multiply, print and break
-- If 'no', then
--- (1) for each element already in array, we calculate sum and place it in our 'hashMap' along with positions (the second entry in position tuple being total size of set) and then
--- (2) append this value to 'array' so it appears at that (1 + total older size of set)
-
- (1) array: a b c d e
- (2) hashmap: a+b (0,1)
- a+c (0,2)
- b+c (1,2)
- a+d (0,3)
- b+d (1,3)
- c+d (2,3)
- a+e (0,4)
- b+e (1,4)
- c+e (2,4)
- d+e (3,4)
"""
def execute():
arr = []
hashmap = {}
for line in lines:
"""
print("arr:")
print(arr)
print("hashmap:")
print(hashmap)
print("") """
val = int(line)
if len(arr) == 0:
arr.append(val)
continue
if len(arr) == 1:
hashmap[arr[0] + val] = (0, 1)
arr.append(val)
continue
if (2020 - val) in hashmap:
position_1, position_2 = hashmap[2020 - val]
print("Result: " + str(arr[position_1] * arr[position_2] * val))
return
hashmap = {**hashmap, **{(entry + val): (index, len(arr)) for index, entry in enumerate(arr)}}
arr.append(val)
lines = open('input-1.txt', 'r')
execute()
Here’s the time taken in 3 consecutive runs:
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$ time python3 part-2-attempt-2.py
Result: 85555470
real 0m0.027s
user 0m0.027s
sys 0m0.000s
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$ time python3 part-2-attempt-2.py
Result: 85555470
real 0m0.028s
user 0m0.020s
sys 0m0.009s
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$ time python3 part-2-attempt-2.py
Result: 85555470
real 0m0.047s
user 0m0.038s
sys 0m0.009s
shashank@shashank-HP-ENVY-Notebook:~/Projects/personal/programming-challenges/advent-of-code/2020/day-1$
This renewed approach reduced runtime by a factor of 3 (and often only by a factor of 2; still better)
Yet, another hole surfaces up this time. Why are we storing indices in one datastructure (hashmap) and later using those to retrieve from another datastructure (array)?
We could just be storing the values themselvves inside the ‘positional pairs’ and avoid array altogether. Right?
No. When we read in, say, a 10th item, we still need to generate its sum from values 0 to 9. If we don’t have an array of all seen so far values, we cannot obtain this sum
We still need to have the 2 datastructure as separate. Do you have a better approach? Tweet as a reply or Open an Issue referencing the title.