Yuanqi Liu

Beijing University of Chemical Technology
B.S.: Chemical Engineering

Research Experience
2017.12 - Present


 Research on the performance of fuel cell ruthenium catalyst

Organic Synthesis

 Thioether-facilitated iridium-catalyzed hydrosilylation of steric 1,1-disubstituted olefins

Medicine Data Analysis
2019.12~Present

 Similarity analysis based on structural information of pharmaceutical compounds

Electrochemistry | 2017.12~2018.06

Research on the performance of fuel cell ruthenium catalyst

Abstract: In fuel cell research, catalysts that drive the electrode reactions with higher efficiency and better performance have
attracted worldwide attention. In this work, the Ru-C catalyst was found to be more effective than the Pt-C catalyst in terms of
electrocatalytic activity. To further improve the catalytic activity of ruthenium-based catalysts, we combined ruthenium with each of
four metals to form core-shell structures that increase the ruthenium surface area. The Ni-Ru catalyst was the most effective, and
improved Ni-Ru synthesis methods are being developed.
• Responsible for designing the experiments, conducting them, processing the data, and writing the final paper.
• Polished and pretreated the electrode surface. Synthesized the Pt-C, Ru-C, and Ni-Ru catalysts, among others, as well as single-metal catalysts. Observed the core-shell structure of the metal catalysts under an electron microscope.
• Assembled the electrode device, prepared acidic and alkaline solutions of different concentrations, and controlled the experimental conditions of the electrode device (electrode rotation speed, gas environment, scan rate). Performed electrochemical characterization with a VersaSTAT 3 and recorded CV curves at regular intervals.
• Used Origin to calculate the integrated area of the CV curves of the different synthesized metal electrodes under different conditions.
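The integration step that was carried out in Origin can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sample sweep is made up, and the reference charge density of 420 µC/cm² is a value commonly quoted for a Cu monolayer deposited at underpotential, not a number taken from this work.

```python
def integrate_trapezoid(x, y):
    """Trapezoidal integral of y over x for sampled points."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


def cv_charge(potential_v, current_a, scan_rate_v_per_s, baseline_a=0.0):
    """Charge in coulombs under one sweep: Q = (1 / v) * integral of (I - I_baseline) dE."""
    corrected = [i - baseline_a for i in current_a]
    return integrate_trapezoid(potential_v, corrected) / scan_rate_v_per_s


def ecsa_cm2(charge_c, charge_density_c_per_cm2=420e-6):
    """Surface area in cm^2 from the peak charge.

    The default charge density is an assumed literature-style reference value,
    not a quantity reported in this document.
    """
    return charge_c / charge_density_c_per_cm2


# Made-up example sweep (potential in V, current in A), scanned at 50 mV/s.
potential = [0.30, 0.35, 0.40, 0.45, 0.50]
current = [0.0, 2e-5, 4e-5, 2e-5, 0.0]
charge = cv_charge(potential, current, scan_rate_v_per_s=0.05)
area = ecsa_cm2(charge)
print(f"charge = {charge:.3e} C, ECSA = {area:.3f} cm^2")
```

Dividing the integrated area (in A·V) by the scan rate (in V/s) converts it to charge, which is what makes curves recorded at 10 mV/s and 50 mV/s comparable.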

● Cyclic voltammetry (CV) is used in this research. The electrode potential is swept back and forth at a controlled scan rate, so that
oxidation and reduction reactions occur alternately on the electrode and a current-potential curve is recorded. The electrolyte
solution contains CuSO4, and a voltage is applied to the electrode so that Cu atoms adsorb on the Ru atoms, which changes the
potential near the electrode. The electrochemically active surface area (ECSA) can then be obtained from the integrated area of
the curve.
● Under acidic (HClO4) and alkaline (KOH) conditions, CV and LSV curves of the Pt-C electrode were measured at different rotation speeds
(1600 rpm, 400 rpm), scan rates (10 mV/s, 50 mV/s), and gas environments (H2, O2, Ar).
● CV and LSV curves of the Ru-C, Ru-Ni, Ru-Co, Ru-Fe, and Ru-Cu electrodes were measured at different rotation speeds, scan rates, and gas
environments under acidic and alkaline conditions.
● The results show that the catalytic activity of the Ru-C catalyst under alkaline conditions is much higher than that of the Pt-C electrode,
by a factor of about nine. The CV curves of the four core-shell structures show that the vacuum-dried Ni group has the highest
activity, followed by Ni, Cu, Fe, and Co. Ni-Ru is highly active, but not as stable as Pt-C.
metals CV curve
The core-shell structures of the four metals were observed under an electron microscope. In Ni-Ru, most of the Ni had been reduced and
had begun to form a hexagonal structure; many particles show an obvious interlayer and are partially uniform. Co-Ru contains a large
number of small, relatively uniform particles, but no obvious interlayer was observed. In Fe-Ru, only a small amount of Fe was
reduced, the core-shell structures formed by Fe were small, and the particle size was uneven. The particles in Cu-Ru are relatively
uniform and some of them form a core-shell structure, but the cores are fewer and smaller, so the material is better regarded as an alloy than as a core-shell structure.
Ni-Ru core-shell structure    Co-Ru core-shell structure
Fe-Ru core-shell structure    Cu-Ru core-shell structure
● Conclusion: the catalytic activity of Ru-C under alkaline conditions (expressed as Ik/ECSA) is higher than that of Pt-C, making it a good substitute. Among the four metal core-shell catalysts, Ni-Ru had the best activity, and its activity was higher still
after vacuum drying.
● Future: Ru-C is not as stable as Pt-C (especially the core-shell structures, whose activity decays severely), so the stability of Ru-C can be studied further. Moreover, the catalytic activity of vacuum-dried Ni-Ru is higher than that of samples without vacuum
drying, possibly because vacuum drying purifies the catalyst or changes its particle structure, which would increase the number of catalytic sites and improve the catalytic activity.

Organic Synthesis | 2018.08~2019.09

Thioether-facilitated iridium-catalyzed hydrosilylation of steric 1,1-disubstituted olefins

Abstract: We demonstrate a hydrosilylation reaction that runs under simple and mild conditions. Terminal olefins containing sulfide bonds were screened, and the hydrosilylation was carried out with an iridium catalyst. The thioether group plays a key role in promoting the iridium catalysis. The method was successfully used for the addition of silanes to terminal olefins, and a reaction mechanism involving Ir-H is proposed.
(European Journal of Organic Chemistry)
• Responsible for conducting experiments with my team members, product processing, and data processing; the work was finally published.
• The corresponding amounts of sulfur-containing terminal alkenes, silane, and different iridium catalysts were added to different
solvents such as DCM and MeCN and stirred for 4 h at room temperature.
• The target products were obtained by column chromatography, and the yield was calculated by NMR. [Ir(COD)Cl]2 as catalyst,
PhMe2SiH as silane, and 1,4-dioxane as solvent were used for subsequent experiments.
• The above conditions were repeated while changing the side chain of the alkene, and the yield was calculated by NMR after each reaction.

● Hydrosilylation of ordinary alkenes is difficult. In the hydrogenation of alkenes, the efficiency and selectivity of the reaction are
improved by introducing directing groups into the alkene substrates. Here, iridium catalyzes the hydrosilylation of an
unactivated alkene with a thioether as the directing group.
● Control experiments showed what is required to promote this process: alkene (1.0 equiv), silane (1.5 equiv), and catalyst (2 mol %) were mixed in solvent and stirred for 4 h at room temperature, and the yield was calculated by NMR using dimethyl sulfone as
internal standard.
● PhMe2SiH was reacted with sulfur-containing 1,1-disubstituted alkenes under iridium catalysis at room temperature, and the yield was determined by NMR. With [Ir(COD)Cl]2 as catalyst and PhMe2SiH as silane, the yield was highest in 1,4-dioxane (95%); the reaction was complete within 2 hours, and only trace by-products were observed.
● Under the selected catalyst and reaction conditions, the yield was measured for different side-chain groups. The yield of alkyl allyl sulfides was high, the yield of aryl allyl sulfides was better, and the yield of allyl sulfide decreased slightly.
plausible reaction mechanism
● Conclusion: a simple and mild sulfur-induced hydrosilylation reaction is demonstrated. The thioether group plays a key role in
promoting this iridium catalysis. The method was successfully applied to the polymerization of disilane and diene.
● Future: further applications of the scheme for more hydrosilylation, polymerization and modification are under way.
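The NMR yields in this section were calculated against dimethyl sulfone as internal standard. A minimal sketch of that arithmetic is given here; the function name and the example integrals are hypothetical, and only the six equivalent protons of dimethyl sulfone are a fixed chemical fact.

```python
def nmr_yield_percent(product_integral, product_protons,
                      standard_integral, standard_protons,
                      standard_mmol, limiting_mmol):
    """Yield (%) from 1H NMR integrals measured against an internal standard.

    Each integral is first normalised by the number of protons that give rise
    to the signal, so the ratio of the normalised integrals equals the molar
    ratio of product to standard.
    """
    per_proton_product = product_integral / product_protons
    per_proton_standard = standard_integral / standard_protons
    product_mmol = per_proton_product / per_proton_standard * standard_mmol
    return 100.0 * product_mmol / limiting_mmol


# Made-up numbers: the dimethyl sulfone singlet (6 H) is set to 6.00 for
# 0.10 mmol of standard, a 6 H product signal integrates to 9.00, and the
# limiting alkene was used on a 0.20 mmol scale.
example_yield = nmr_yield_percent(
    product_integral=9.00, product_protons=6,
    standard_integral=6.00, standard_protons=6,
    standard_mmol=0.10, limiting_mmol=0.20,
)
print(f"NMR yield: {example_yield:.1f} %")
```

Normalising by proton count first means any well-resolved product signal can be used, as long as its proton count is known.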

Medicine Data Analysis | 2019.12~Present

Similarity analysis based on structural information of pharmaceutical compounds

  [As the project leader, plan the project, collect compound information, process data and analyse results]
Keywords: Similarity analysis, machine learning, data visualization, medical data analysis
  • Choose different molecular descriptors and algorithms to build diverse models that predict drug similarity
  • Combine key feature screening methods to obtain specific substructures
  • Further divide the drugs into sub-categories to accurately find the key structures behind drug efficacy
• A huge amount of data can be obtained from drug compounds, but the formats are highly diverse and
inconsistent, which makes it hard to analyse the data with the well-developed software tools of signal processing and pattern recognition. Meanwhile, property estimation software has been developed for chemical research, with which the chemical
and physical properties of a compound can be estimated for its characterization.
In this work, in order to find unique substructures with specific drug effects, an analysis is conducted on drugs that are pre-classified according to the ATC (Anatomical Therapeutic Chemical) system into fourteen defined categories. Suitable machine learning algorithms, such as Kmeans clustering, hierarchical clustering, genetic algorithms, random forest, and naive Bayes classification, are selected. The categories are further subdivided into multiple subcategories according to similarity. For each subdivided drug category, the aim is to find a unique drug-effect structure for each subcategory to help drug discovery.
Methodology and Detail
Initial Stage  2019.12~2020.06
• First, based on drug efficacy, we pre-selected a total of 463 compounds from seven kinds of drugs for preliminary screening. The properties were obtained from the online tool SwissADME and the thermodynamic property prediction software ProCAPE. Examples of compound properties estimated by SwissADME are shown in Figure 1 and Figure 2.
Figure 1. Predicted properties of Ibuprofen in SwissADME
Figure 2. Predicted properties of Glymidine in SwissADME
• Using a scatter matrix in Origin, we analysed all estimated properties of all drugs together, without regard to drug class, to check which properties are valid. The screening principle is that, among any group of linearly correlated properties in the scatter matrix, at most one is kept. In the end, 7 properties from SwissADME and 5 properties from ProCAPE were retained. The scatter matrix used for the final screening is shown in Figure 3.
Figure 3. Scatter matrix among properties
• The 463 drugs were labelled and visualized after dimensionality reduction with the t-SNE method. The result is shown in Figure 4.
Figure 4. t-SNE visualization
The drug properties are hierarchically clustered and output as a tree diagram, shown in Figure 5. The classification results are summarized in Table 1.
Figure 5. Hierarchical clustering tree diagram
Table 1. Hierarchical clustering results

Then 20% of each type of drug is randomly selected as the test set, and the remaining 80% is used as the training set. The SVM reaches an accuracy of 72.6% on the test set.
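The per-class 80/20 split can be sketched as follows. This is a minimal illustration rather than the code used in the project: the function name, the seed, and the toy labels are assumptions, and the SVM itself is not reproduced.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), sampling test_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Keep at least one test sample per class, even for very small classes.
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


# Toy labels standing in for drug classes.
toy_labels = ["A"] * 10 + ["B"] * 5
train_idx, test_idx = stratified_split(toy_labels)
print(len(train_idx), "training samples,", len(test_idx), "test samples")
```

Splitting within each class keeps the class proportions of the test set close to those of the full data set, which matters when some drug classes are much smaller than others.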


● Current Conclusion:
The results show that the drug similarity obtained through visualization and clustering is clear for some types of drugs, but overall it is still insufficient. Therefore, we changed direction: divide the drugs into multiple sub-categories and try different algorithms.
Second Stage  2020.07~Present
• To further classify drugs and expand the existing data set, drugs are pre-divided into 14 categories based on ATC, and each category is subdivided into multiple subcategories. The most suitable model is chosen for each category, and characteristic structures are sought to help drug development.
(The 14 categories of ATC include Alimentary tract and metabolism; Blood and blood forming organs; Cardiovascular system; Dermatologicals; Genito-urinary system and sex hormones; Systemic hormonal preparations, excluding sex hormones and insulins; Antiinfectives for systemic use; Antineoplastic and immunomodulating agents; Musculo-skeletal system; Nervous system; Antiparasitic products, insecticides and repellents; Respiratory system; Sensory organs; Various)
• A genetic algorithm is combined with Kmeans, using the classification accuracy of Kmeans as the criterion for finding the best feature set of a given dimension. Antithrombotic and anti-bleeding drugs are taken as samples, and the 755 features obtained from ChemDes are used as the initial data set. The variance threshold is set to 0.01; after deleting the features with variance below 0.01, a data set with 581 features remains.
Condition 1:
The number of iterations is 100, and the initial population size is 200.
Results: The best classification accuracy obtained is 0.80952; the results are shown in Table 2.
The running time became longer without any improvement in classification accuracy, so the run was not continued.
Table 2. Results under Condition 1
Condition 2:
The number of iterations is 300, and the initial population size is 500. The final feature dimension is set to 1.
Result: IC6 is the best feature obtained in this run.
Condition 3:
The number of iterations is 150, the initial population size is 600, and the final feature dimension is set to 1.
Result: ATSe3 is the best feature obtained in this run.
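The idea behind these runs, searching for a feature subset of fixed dimension whose Kmeans clustering best matches the known classes, can be illustrated with a deliberately simplified sketch. It is not the project's implementation: it uses mutation and elitist selection only (no crossover), a two-cluster k-means with deterministic initialisation, and small hypothetical defaults for population size and generations. All function names are made up for the example.

```python
import random


def kmeans_labels(rows, k=2, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(dist2(r, c) for c in centers)))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_class_accuracy(labels, truth):
    """Accuracy of a 2-cluster labelling against 0/1 classes, up to label swap."""
    agree = sum(lab == t for lab, t in zip(labels, truth)) / len(truth)
    return max(agree, 1.0 - agree)


def evolve_feature_subset(X, y, n_keep, pop_size=20, generations=15, seed=0):
    """Search for n_keep features (n_keep >= 1) maximising k-means accuracy."""
    rng = random.Random(seed)
    n = len(X[0])

    def random_mask():
        chosen = set(rng.sample(range(n), n_keep))
        return tuple(i in chosen for i in range(n))

    def fitness(mask):
        rows = [[v for v, keep in zip(r, mask) if keep] for r in X]
        return two_class_accuracy(kmeans_labels(rows), y)

    def mutate(mask):
        # Swap one selected feature for one unselected feature.
        on = [i for i, keep in enumerate(mask) if keep]
        off = [i for i, keep in enumerate(mask) if not keep]
        if not off:
            return mask
        new = list(mask)
        new[rng.choice(on)], new[rng.choice(off)] = False, True
        return tuple(new)

    population = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitism: best half survives
        children = [mutate(rng.choice(parents)) for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


# Toy data: feature 0 separates the two classes, feature 1 does not.
toy_X = [[0.0, 1], [0.1, 2], [0.2, 1], [5.0, 2], [5.1, 1], [5.2, 2]]
toy_y = [0, 0, 0, 1, 1, 1]
best_mask, best_acc = evolve_feature_subset(toy_X, toy_y, n_keep=1)
print(best_mask, best_acc)
```

Because every fitness evaluation runs a full clustering, the cost grows with population size times generations, which is consistent with the longer running times noted for the larger settings.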
Earlier, manual feature deletion had shown that ATSe4 and ATSe5 give higher classification accuracy, with ATSe4 reaching a classification accuracy of 0.854. The Spearman correlation coefficient was used to analyse the correlation between ATSe3, ATSe4, and ATSe5. ATSe3 and ATSe4 were found to vary together strongly.
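For reference, the Spearman coefficient is the Pearson correlation of the rank-transformed values. A minimal pure-Python sketch is given here with hypothetical function names and made-up numbers; the descriptor values of ATSe3, ATSe4, and ATSe5 are not reproduced.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; both inputs must have non-zero spread."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a) ** 0.5
    spread_b = sum((y - mean_b) ** 2 for y in b) ** 0.5
    return cov / (spread_a * spread_b)


def spearman(a, b):
    """Spearman rank correlation between two equally long sequences."""
    return pearson(average_ranks(a), average_ranks(b))


# Made-up values: the second series rises whenever the first does,
# although the relationship is not linear.
rho = spearman([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 45.0])
print(rho)
```

Because only ranks enter the calculation, two descriptors that rise and fall together score close to 1 even when their relationship is not linear.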
● Although many studies have compared the predictions of various models for the 14 ATC categories, few have examined finer subdivisions of, or structural differences between, these types of drugs. We are trying to find more convincing methods for identifying key features, in order to obtain key descriptors and key fingerprint positions.
Therefore, we supplemented the previously collected data with the anti-thrombotic and anti-bleeding drugs in the KEGG database, classified the anti-thrombotic and anti-bleeding small-molecule drugs in KEGG, and carried out training and testing.
The previous data included 48 antithrombotic drugs and 13 anti-bleeding drugs. Four antithrombotic drugs were added from the KEGG database. The 13 anti-thrombotic drugs and 10 anti-hemorrhage drugs from KEGG are used as the training set, the remaining 39 anti-thrombotic drugs and 3 anti-hemorrhage drugs are used as the test set, and the Kmeans algorithm is used for classification.
The classification accuracy of the training set is 0.80952, and the classification accuracy of the test set is 0.782608.
The correlation coefficient is used to measure the correlation between variables. Within each group of strongly correlated features, the feature carrying the most information is retained and the remaining features are deleted, which yields the best feature subset.
The original feature set has 755 features. After deleting the features with small variance, 581 features remain, and the classification accuracy based on these 581 features is 0.803278. After deleting strongly correlated features, 341 features remain, and the classification accuracy is 0.852459. Screening out correlated features thus improved the classification accuracy from 0.803278 to 0.852459.
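The two screening steps, a variance threshold followed by removal of strongly correlated features, can be sketched as below. Several choices are assumptions made for the example and are not stated in the text: variance is used as the proxy for "amount of information", Pearson correlation is used, the correlation cut-off is 0.9, and the function names are hypothetical. Only the variance threshold of 0.01 comes from the description.

```python
def column(X, j):
    return [row[j] for row in X]


def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def abs_pearson(a, b):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denom = (sum((x - mean_a) ** 2 for x in a)
             * sum((y - mean_b) ** 2 for y in b)) ** 0.5
    return abs(cov / denom) if denom else 0.0


def select_features(X, var_threshold=0.01, corr_threshold=0.9):
    """Return the sorted indices of the features that survive both filters."""
    n_features = len(X[0])
    variances = [variance(column(X, j)) for j in range(n_features)]

    # Step 1: drop features whose variance is below the threshold.
    candidates = [j for j in range(n_features) if variances[j] >= var_threshold]

    # Step 2: visit features from highest to lowest variance and keep one only
    # if it is not strongly correlated with any feature already kept.
    candidates.sort(key=lambda j: variances[j], reverse=True)
    kept = []
    for j in candidates:
        if all(abs_pearson(column(X, j), column(X, k)) < corr_threshold for k in kept):
            kept.append(j)
    return sorted(kept)


# Toy matrix: feature 1 is exactly twice feature 0, feature 3 is nearly constant.
toy_features = [
    [1.0, 2.0, 1.0, 5.00],
    [2.0, 4.0, -1.0, 5.00],
    [3.0, 6.0, 1.0, 5.00],
    [4.0, 8.0, -1.0, 5.01],
]
print(select_features(toy_features))
```

Visiting features in order of decreasing variance is what makes each correlated group keep its highest-variance member; a different information measure would only change the sort key.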
Compared with molecular descriptors, fingerprints give slightly better results. Taking the classification of antithrombotic and anti-bleeding drugs as an example, the classification accuracy with molecular descriptors is 0.80327, and the classification accuracy with fingerprints is 0.81967.
● Plan to judge the similarity of two drugs from the similarity of their adjacency matrices, either by converting each matrix to a vector or by counting the proportion of entries the two matrices have in common, and then to combine this with the key feature screening method to obtain specific substructures. For large data sets, we may combine positive and negative data sets into a new data set and find the substructure through a binary-classification-style method.
● Plan to implement hierarchical clustering, random forest, and naive Bayes models to classify all current drugs and compare the classification results.
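One possible reading of the planned adjacency-matrix comparison, counting the proportion of bonds the two matrices share, is sketched below. It rests on a strong assumption that the plan does not spell out: the atoms of both molecules are numbered in a comparable way, for example after a shared canonical numbering or an alignment step. The function names and toy matrices are hypothetical.

```python
def bonded_pairs(adjacency):
    """Set of (i, j) atom pairs with i < j that are bonded in the matrix."""
    n = len(adjacency)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if adjacency[i][j]}


def adjacency_overlap(a, b):
    """Shared bonds divided by all bonds present in either matrix (Jaccard index)."""
    pairs_a, pairs_b = bonded_pairs(a), bonded_pairs(b)
    union = pairs_a | pairs_b
    if not union:
        return 1.0  # two molecules without bonds are treated as identical
    return len(pairs_a & pairs_b) / len(union)


# Toy example: a three-atom chain versus a three-membered ring.
chain = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
ring = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
overlap = adjacency_overlap(chain, ring)
print(overlap)
```

Working with sets of bonded pairs rather than flattened matrices also allows molecules with different atom counts to be compared, although the numbering assumption becomes harder to satisfy as the molecules diverge.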
(This work is constantly being updated every week.)