Boyin Zhang's Homepage

Introduction

Games, coding, and research. I mostly play Pokemon (from Gen 1 to Gen 8), Dota 2 (former Divine player), PUBG (TPP rank 3%), Civilization, HOI4, Europa Universalis IV, and nearly all simulation or strategy games!

More details about recent projects and internships will be added by March 2019.

My Projects

Research Projects

This was my senior thesis project, supervised by Professor Svetlana Lazebnik. It focused on applying the state-of-the-art object detection algorithm SSD (Single Shot MultiBox Detector) to object detection on high-resolution images. The first version of the model combined a sliding-window method with the SSD prediction model to do the detection, and used the NMS (non-maximum suppression) algorithm to remove duplicate detections. The advanced version used AZ-Net (adjacency and zoom network) to reduce the number of detections SSD had to run, which sped up the detection process. The final version of this project gave much better detection results on high-resolution images than previous methods.
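
The duplicate-removal step can be illustrated with a minimal sketch of greedy non-maximum suppression. This is not the thesis code: the box format `(x1, y1, x2, y2)`, the function names `iou` and `nms`, and the 0.5 overlap threshold are all assumptions made for the example.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two overlapping windows report the same object; a third box is elsewhere.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

With a sliding window, neighboring windows often detect the same object, so the lower-scoring of the two overlapping boxes above is dropped while the distant box is kept.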

This research project was more about thinking. I worked with my friend Dajun Xu and PhD student Long Pham, supervised by Professor Kevin Chang. In the first phase of the project, I came up with the idea of the BitMap (using a grid system to obtain an embedding of a webpage). I then implemented the construction of the BitMap from the data that Dajun collected, and wrote the training script using the deep learning framework TensorFlow. The part we have not yet finished is the dataset (a model like this usually needs at least thousands of labeled examples to train).

This work was assigned by Professor Ruta Mehta and implemented in Python. In this project, I studied game theory (resource allocation and bidding). Then, following Professor Mehta's guidance, I implemented a model to predict behaviors and gross capital changes in a free-market economy and in a centrally planned economy.

I worked with PhD student Zengming Shen to integrate the Spatial Transformer Network into Caffe, and then used the model to do semantic segmentation on medical images. This project was the start of my research journey, and I began reading papers about state-of-the-art techniques. I really appreciate Zengming letting me work with him. Following his guidance, I watched the Stanford course CS231n (Convolutional Neural Networks), which gave me a good picture of how to start learning deep learning. During this project I also taught myself the deep learning library Caffe. At the beginning, I encountered many difficulties, as Caffe is quite a large framework written in C++ and uses many C++11 and later features that I had not studied before. Over time, I managed to understand the Caffe source code and used it as the tool to train the neural networks.

Other Projects

This is a course project made up of several parts. By default, the whole system consists of 10 machines (one server, one standby server, one client, and 7 workers).

  • First, I implemented the failure detectors for the whole system, using a round-based failure detection method that can tolerate up to 3 simultaneous failures. Each machine sends its membership list to its neighbors while continually updating its own heartbeat.
  • Second, I implemented the distributed file system (DFS), in particular the PUT, GET, and DELETE operations. PUT lets each machine send a local file to the DFS, GET lets each machine download a specific file from the storage system, and DELETE lets each machine in the storage system remove a file from it. We used SHA hashing to choose the machine that stores a newly uploaded file, and the next two machines each store a replica so that the system can handle machine failures. For a later PUT of the same file, we find any 2 machines that hold the file and update them. For GET, we find the machine that has the latest version and download from it.
  • For the last part of the system, I implemented the distributed graph processing system (DGPS) so that people can run specific tasks on it, such as PageRank and SSSP (Single-Source Shortest Path). I used the standard Gather-Apply-Scatter pipeline for the DGPS. Each worker stores part of the graph's nodes and their edges (in my design, the partition is made by modular computation). In each round, every worker does the calculation for the nodes it stores and then sends the updated information to the target machines. In an experiment comparing against Spark, for 20 rounds of PageRank and SSSP, my system took about one third of the time Spark needed to correctly finish the same tasks.
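
The file placement rule from the second part (the distributed file system) can be sketched as follows. This is a minimal illustration rather than the project's code: the choice of SHA-1, the function name `replica_machines`, and the use of all 10 machines as storage nodes are assumptions. The description above only states that SHA hashing picks one machine and that the next two machines hold replicas.

```python
import hashlib


def replica_machines(filename, num_machines=10, num_replicas=3):
    """Return the ids of the machines that store `filename`, primary first."""
    # Hash the file name and map the digest onto the machine id space.
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    primary = int(digest, 16) % num_machines
    # The primary plus the next two machines, wrapping around at the end.
    return [(primary + k) % num_machines for k in range(num_replicas)]


# "report.txt" is a hypothetical file name used only for illustration.
placement = replica_machines("report.txt")
```

Because the placement depends only on the file name, any machine can recompute where a file lives without asking a central directory, and three copies mean a file survives the loss of up to two of its holders.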

Here are several pieces of coursework from CS447: Natural Language Processing.

  • Language Model with Smoothing: Implemented a unigram language model with Laplace smoothing and a bigram language model with absolute discounting and Kneser-Ney smoothing to generate simple English sentences.
  • Finite State Transducer: Implemented a finite-state transducer to convert English verbs to their correct -ing forms.
  • POS Tagger: Implemented a Hidden Markov Model for part-of-speech tagging, in particular the Viterbi algorithm for the bigram HMM.
  • Context-Free Grammar: Used the Python library NLTK and wrote a CFG file specifying rules of English grammar in order to parse English sentences.
  • PCFG: Implemented the CKY algorithm for probabilistic context-free grammars to do probabilistic parsing.
  • Naive Machine Translation Model: Implemented and trained IBM Model 1 for word alignment and sentence translation, using the EM algorithm in training to estimate the probability of a source word given a target word.
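
As an illustration of the last item, here is a minimal sketch of EM training for IBM Model 1, estimating t(f | e), the probability of a source word f given a target word e. It is not the coursework code: the function name `train_ibm_model1`, the three-sentence German-English toy corpus, and the omission of the NULL target word are simplifying assumptions.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=20):
    """Estimate t(f | e) with EM; `pairs` holds (source_words, target_words)."""
    source_vocab = {f for fs, _ in pairs for f in fs}
    # Uniform initialisation over every co-occurring (f, e) pair.
    t = {(f, e): 1.0 / len(source_vocab)
         for fs, es in pairs for f in fs for e in es}
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e being aligned
        # E-step: spread each source word over the target words in its pair.
        for fs, es in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise the expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


# Toy parallel corpus (illustrative only).
pairs = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_ibm_model1(pairs)
```

Even though every pairing starts equally likely, "das" appears with "the" in two sentences, so the expected counts gradually shift probability toward the pair ("das", "the") and away from ("das", "house").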

Splendor is a famous board game among university students. As we played it many times during the semester, we came up with the idea of implementing it in Java. This is purely a software project. The first version was the final project of CS242 (Programming Studio), implemented in Java, in which I wrote the game logic, game loop, networking, and replay mode. The work was split over 4 weeks, and each week we wrote both code and tests for our program. After that, I ported it to Android, where I practiced writing Android layout code as well as the controllers for Android activities and intents. Later, I also re-implemented the game in Kotlin, which had recently been announced as a supported language for Android.

This is a script written in Python to predict the next day's S&P 500 index. The prediction model is an LSTM network trained with Keras on TensorFlow. The training data is the daily closing price of the S&P 500; every 23 days form a group, where the first 22 days are the inputs and the 23rd price is the ground truth for the prediction. For now, the model has an accuracy of 80% in predicting a single day's change based on the previous 22 days' data. This was my first attempt at applying machine learning to finance. As this is something that really interests me, I will try more models and other hyperparameters in the near future.
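
The grouping of prices into training examples can be sketched as below. This is an illustration under assumptions, not the original script: the function name `make_windows` is hypothetical, and the windows are assumed to overlap with a stride of one day, which the description does not specify.

```python
def make_windows(closes, window=22):
    """Split a closing-price series into (inputs, target) training examples."""
    inputs, targets = [], []
    for start in range(len(closes) - window):
        inputs.append(closes[start:start + window])  # 22 consecutive closes
        targets.append(closes[start + window])       # the 23rd close
    return inputs, targets


# Placeholder series for illustration, not real S&P 500 data.
closes = [float(day) for day in range(30)]
X, y = make_windows(closes)
```

Each row of `X` would then be fed to the LSTM as a sequence of 22 time steps, with the matching entry of `y` as the value to predict.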

This is the course project of CS412 (Data Mining), in which we implemented a Chrome extension in JavaScript to extract information from YouTube. We implemented 3 different algorithms: K-Means, distance comparison, and Naive Bayes. Among these 3 methods, we found that the distance method and the Naive Bayes model gave better results. We also added a function to export the visual blocks highlighted by our model to CSV files, which can be used for future analysis.

Undergraduate Courses

CS241 System Programming

An introductory course on system programming, covering threads, processes, synchronization, networking, and file systems. The assignments were done in C and included many interesting implementations, such as malloc and a shell.

CS242 Programming Studio

The programming studio course, where I did many projects. The first project was implementing chess in Java, including the game logic and GUI. The second project was about scraping film data from Wikipedia, then processing and analyzing the data with Python scripts. The third project was to build a web page for our assignments; the front end was written in HTML and CSS with Bootstrap, and the back end in Python with Flask. The final project of the course was the implementation of Splendor.

CS374 Algorithms & Models of Computation

An introductory course on algorithms and models of computation, including regular expressions, DFAs, NFAs, CFGs, Turing machines, divide and conquer, DP, greedy algorithms, graph algorithms, and NP/NP-hard/NP-complete problems.

CS412 Data Mining

An introductory course on data mining, including data warehousing, pattern discovery, clustering, and classification.

CS421 Programming Languages and Compilers

Introduced the functional programming language Haskell and many concepts, including tail recursion, continuation-passing style, and so on. We built a simple compiler in Haskell in the machine problems of this course.

CS446 Machine Learning

An introductory course on machine learning. Basic concepts included the Perceptron and Winnow algorithms, the stochastic gradient descent algorithm, kernel methods, support vector machines, neural networks, and computational machine learning.

CS450 Numerical Analysis

The follow-up course to CS357 (Numerical Methods). Topics included numerical algorithms for linear algebra, regression, optimization, differentiation/integration, and ordinary/partial differential equations.

CS473 Theory 2

An advanced algorithms course. Topics included advanced dynamic programming, graph theory, network flow algorithms, matching problems, linear programming, and NP-hard problems.

Graduate Courses

CS425 Distributed Systems

More details about this course can be found in the first project in the Other Projects section (a distributed graph processing system). The course covered many concepts, algorithms, and applications in the field of distributed systems, from MapReduce and failure detection to datacenters.

CS447 Introduction to Natural Language Processing

Discussed the traditional concepts and algorithms in natural language processing, mainly statistical NLP models such as the HMM tagger, the PCFG parser, the IBM machine translation model, and so on.

CS598TEL Machine Learning Theory

The most difficult course I have taken at UIUC. It covered theories in machine learning, including representation (from linear separators to neural networks), optimization (such as gradient descent and its convergence rate), and generalization (covering numbers, Rademacher complexity, VC dimension). I did many proofs in this course.

Contact Me
