Sorting Algorithms
Working with Data Structures and manipulating data.
import random
numbers = []
for i in range(10):
numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Warm Up
Discuss with a partner... What are some strategies you would use to sort this list? (Don't worry about writing code for now)
- I would take each individual number and move them around in order to make the numbers least to greatest. I would do this by comparing that specific number to all the other numbers in the list and then sorting it based on that.
Explore
Get into groups of 3
We will be focusing on 4 algorithms today.
We will look at the first one together, Bubble Sort
What is happening with this sort?
In your groups you will each choose to be an expert on a sorting algorithm. Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm and be ready to explain it to your other group members.
- The process of merge sort is to divide the array into two halves, sort each half, and then merge the sorted halves back together. This process is repeated until the entire array is sorted.
- The algorithm repeatedly selects the smallest (or largest) element from the unsorted portion of the list and swaps it with the first element of the unsorted portion. This process is repeated for the remaining unsorted portion of the list until the entire list is sorted.
- The array is virtually split into a sorted and an unsorted part. Values from the unsorted part are picked and placed at the correct position in the sorted part.
Practice
[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]
How would you sort this list with...
- Bubble Sort
-
Selection Sort
Explain.
-
Bubble Sort - swaps adjacent values until they are in the correct order.
- 75 and 17 would switch places, 46 and 75 would switch places and so on.
- Selection Sort - the smallest or largest value is swapped with the first term of the unsorted array
- 80 would be moved to the very end with swapping, then 79 would be moved with swapping and so on.
[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
How would you sort this list with...
- Merge Sort
-
Insertion Sort
Explain.
-
Merge Sort - splits list into parts and then swaps values until they sort
- 88, 39, 53, 39, 58 and 43, 74, 81, 71, 51 then those would get split and then those would get split until the list is sorted
- Insertion Sort - looks at unsorted value and then moves it until all values after it are greater
- looks at 39 and 88 and moves 39 before 88, continues with all other values until sorted
import nltk
import random
nltk.download('words') # Download the word list (only required once)
from nltk.corpus import words
english_words = words.words()
#print(len(english_words)) # Prints the number of words in the list
# You can now use the 'english_words' list in your code
words = []
for i in range(10):
words.append(english_words[random.randint(0,len(english_words))])
print("Random List")
print(words)
You would look at the first letter and compare them to all the other first letters of the words and that would allow you too look through and sort the words. In this case, with merge sort, you could split the list and then you would split those segments until you're left with the simplest parts of the array and then from there you would sort alphabetically.
Discuss
Answer the following with your group.
- When should you use each algorithm? What makes an algorithm the right choice?
- You would use the algorithm that would take the least amount of steps.
- Given the following lists...
- [0, 2, 6, 4, 8, 10] - You would use insertion because only one swap would be used. Bubble would also work.
- [Elephant, Banana, Cat, Dog, Apple] - Selection would work because only the first and last would need to be sorted.
- [29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50] - Merge would be used because it is the longest because it's complexity is logarithmic.
HACKS
Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.
-
Now it's time to do some coding...
-
Run code and then research and answer these questions...
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
- A list and dictionary are both considered a collection type in python because they are a group of values that are stored. They can be edited, deleted, or added to the structure as necessary.
- Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
- The list in bubble sort is pass by value because it is passed as an argument in a function
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
-
Implement new cell(s) and/or organize cells to do the following.
- Create your own list
- Use your expertise sorting algorithm (selection, insertion, merge). Note, I got my bubble sort from Geek for Geeks and made modifications. Each student in a group should have a unique algorithm.
- Test your list with my bubble sort
- Test my list with your new sort, do NOT make a copy my list when doing this
- Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.
Use the code below to help guide your adventure
Research Analysis
-
Both bubble sort and merge sort seem to have the same time complexity in this case, however this is probably because the lists are very small. Theoretically merge sort should be faster than bubble sort because merge sort has a log time complexity while bubble sort has a n^2 time complexity, meaning that it gets slower for larger lists.
-
In terms of code, bubble sort takes a shorter amount of code while merge sort takes a larger amount of code because it is more complex than bubble sort.
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return # exit function
if __name__ == "__main__":
# list/dictionary sample
car_list = [
{"car": "Lamborghini", "model": "Aventador", "price": 500000, "horsepower": 700},
{"car": "Ferrari", "model": "488 GTB", "price": 250000, "horsepower": 660},
{"car": "Porsche", "model": "911 Turbo S", "price": 200000, "horsepower": 640},
{"car": "McLaren", "model": "720S", "price": 300000, "horsepower": 710},
{"car": "Bugatti", "model": "Chiron", "price": 3000000, "horsepower": 1500}
]
# assuming uniform keys, pick 1st row as source of keys
key_row = car_list [0]
# print list as defined
print("Original")
print(car_list )
for key in key_row: # finds each key in the row
print(key)
bubbleSort(car_list , key) # sort list of people
print(car_list )
from tabulate import tabulate
def merge_sort(lst, key):
if len(lst) <= 1:
return lst
mid = len(lst) // 2
left_half = lst[:mid]
right_half = lst[mid:]
left_sorted = merge_sort(left_half, key)
right_sorted = merge_sort(right_half, key)
return merge(left_sorted, right_sorted, key)
def merge(left, right, key):
merged = []
left_index = right_index = 0
while left_index < len(left) and right_index < len(right):
if left[left_index][key] <= right[right_index][key]:
merged.append(left[left_index])
left_index += 1
else:
merged.append(right[right_index])
right_index += 1
while left_index < len(left):
merged.append(left[left_index])
left_index += 1
while right_index < len(right):
merged.append(right[right_index])
right_index += 1
return merged
# Example usage
car_list = [
{"car": "Lamborghini", "model": "Aventador", "price": 500000, "horsepower": 700},
{"car": "Ferrari", "model": "488 GTB", "price": 250000, "horsepower": 660},
{"car": "Porsche", "model": "911 Turbo S", "price": 200000, "horsepower": 640},
{"car": "McLaren", "model": "720S", "price": 300000, "horsepower": 710},
{"car": "Bugatti", "model": "Chiron", "price": 3000000, "horsepower": 1500}
]
key_list = ["car", "model", "price", "horsepower"]
for key in key_list:
sorted_cars = merge_sort(car_list, key)
headers = ["Car", "Model", "Price", "Horsepower"]
rows = [[car["car"], car["model"], car["price"], car["horsepower"]] for car in sorted_cars]
print(f"Sorting by {key}:")
print(tabulate(rows, headers=headers, tablefmt="fancy_grid"))
print()