I believe that the future of AI lies not in replacing humans but in better supporting them with automated intelligence. Hence, my career goal is to create "human in the loop" machine learning environments.

PhD Computer Science North Carolina State University Aug, 2015 - 2020 (expected)
MS Control Science and Engineering Shanghai Jiaotong University Sep, 2011 - Mar, 2014
BS Automation Shanghai Jiaotong University Sep, 2007 - Jun, 2011
Graduate Research Assistant, The SeBIG Lab, LexisNexis and NCSU. Aug, 2015 - Now

Member of a new lab called “SeBig” (Software Engineering for Big Data), established as a joint research collaboration between LexisNexis and NC State. Working with two other graduate students on validation methods for Big Data applications in large-scale industrial data, using Natural Language Processing, Deep Learning, and more.

PhD Scholar, The RAISE Lab, Department of Computer Science, NCSU. Aug, 2015 - Now

Working as a member of the RAISE Lab under the supervision of Dr. Menzies. My primary research applies machine learning algorithms to help humans retrieve desired information from big data with less effort.

  • Developed a method called FASTREAD to support fast selection of primary studies in systematic reviews.
  • Developed a tool implementing FASTREAD, available at https://github.com/fastread/src.
  • Applied the same idea to software security vulnerability prediction and test case prioritization.
Quantitative Investment in Stock Index Futures with Machine Learning, Shanghai Jiaotong University Mar, 2014 - Aug, 2014

Conducted several experiments on real-world stock index futures data under the supervision of Dr. Yuan.

  • Established a feature selection scheme for stock index futures using low-rank approximation and sparse representation.
  • Implemented an online quantitative investment algorithm with reinforcement learning.
Disturbance Observer Based Control on Multi-variable Plants, Shanghai Jiaotong University Feb, 2011 - Mar, 2014

Part of a project funded by the National Natural Science Foundation of China. Worked as a research assistant in the RCIR Lab under the direction of Dr. Su.

  • Established a sufficient condition for the closed-loop robust stability of a disturbance observer-based multi-variable control system.
  • Proposed a systematic design procedure for the multi-variable disturbance observer.
  • Validated the efficacy of the control method through experiments on a quadrotor system.
The Design and Construction of a Plug-and-play Mobile Robot System, Shanghai Jiaotong University Mar, 2014 - Aug, 2014
  • Designed the operation interface of the robot system so that the control computer can operate a group of mobile robots simultaneously.
  • Established multi-platform communication through socket programming to support plug-and-play operation.
Software Engineering Intern, Google, Mountain View May, 2018 - Aug, 2018
  • Found representative images for each entity by averaging their starburst embeddings.
  • Trained a dual encoder model between entity and image starburst embeddings.
  • Designed and tested several metrics to evaluate model performance.
  • Added a feature to the dual encoder framework to support dense features.
Software Engineer, LexisNexis, Raleigh May, 2017 - Aug, 2017
  • Constructed the software architecture for Python tasks on Amazon Web Services.
  • Gained hands-on experience with AWS Lambda, S3, EMR, Apache Spark, Livy, and Flask.
  • Implemented several machine learning algorithms, e.g. paragraph vector, latent Dirichlet allocation, and named entity recognition.
Software Engineer, LexisNexis, Raleigh May, 2016 - Aug, 2016
  • Created a sandbox for prototyping new DiscoveryIQ features. (Python + JS + ElasticSearch)
  • Developed a new DiscoveryIQ feature called "Open the blackbox".
  • Incorporated the new feature into the current DiscoveryIQ product. (Scala + Spark)
Engineer, Technical Department of NEW BRP, Beijing Aug, 2014 - Jul, 2015
  • Completed the full production process for a motor control center, including assembly, wiring, and debugging.
  • Took part in a project to improve motor control performance with a disturbance observer.


  • Yu, Z., Menzies, T., 2019. "FAST2: An intelligent assistant for finding relevant papers." Expert Systems with Applications, 120: 57-71. https://arxiv.org/abs/1705.05420.pdf
  • Yu, Z., Kraft, N.A. and Menzies, T., 2018. "Finding Better Active Learners for Faster Literature Reviews." Empirical Software Engineering. https://arxiv.org/pdf/1612.03224.pdf
  • Yu, Z., Menzies, T., 2017. "Data Balancing for Technologically Assisted Reviews: Undersampling or Reweighting." In: CLEF 2017 Working Notes. CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org/Vol-1866/ (2017) http://ceur-ws.org/Vol-1866/paper_120.pdf
  • Nair, V., Yu, Z., Menzies, T., Siegmund, N. and Apel, S., 2018. "Finding Faster Configurations using FLASH." IEEE Transactions on Software Engineering. https://arxiv.org/abs/1801.02175
  • Krishna, Rahul, Zhe Yu, Amritanshu Agrawal, Manuel Dominguez, and David Wolf. "The BigSE project: lessons learned from validating industrial text mining." In Proceedings of the 2nd International Workshop on BIG Data Software Engineering, pp. 65-71. ACM, 2016. http://dl.acm.org/citation.cfm?id=2896836
  • Zhe Yu, Jianbo Su. "Robust Disturbance Observer Based Control for Multi-variable Systems", IFAC LSS 2013, July 7-9, 2013, Shanghai, China http://www.ifac-papersonline.net/Detailed/60939.html
  • Zhe Yu, Lu Wang, Jianbo Su. "Disturbance Observer Based Control for Linear Multi-variable Systems with Uncertainties", Acta Automatica Sinica, 2014, 40(11): 2643-2651.

Under review

  • Yu, Z., Theisen, C., Williams, L., and Menzies, T., 2018. "Improving Vulnerability Inspection Efficiency Using Active Learning." arXiv preprint arXiv:1803.06545. https://arxiv.org/pdf/1803.06545.pdf