Deep learning allows for the fast design of reconstruction algorithms for high-energy physics. I will discuss how a deep neural network architecture was designed to match the data hierarchies of jets. The novel method is integrated into the workflow of the CMS experiment, and performance results obtained in simulation are shown. New results for top tagging with DeepJet in CMS will be presented for the first time. Beyond the performance impact in simulation, I will discuss methods from the field of data science that can be used to address some of the problems that arise from differences between data and simulation. These methods could reduce systematic uncertainties and improve the performance in data.