Refined Experimental Property Data
-
We provide refined experimental data covering thermo-physicochemical, thermodynamic, transport, and pharmaceutical properties. This data serves as a benchmark for cross-verifying predictions and improving their robustness. The available refined experimental data is listed below.
Constant Properties

| Property Name | Number of Compounds | Number of Data Points |
| --- | --- | --- |
| Absolute Entropy of Ideal Gas at 298.15 K and 1 bar | 1,864 | 1,864 |
| Acentric Factor | 1,857 | 1,857 |
| Critical Compressibility Factor | 1,355 | 1,355 |
| Critical Pressure | 3,157 | 3,157 |
| Critical Temperature | 2,668 | 2,668 |
| Critical Volume | 2,468 | 2,468 |
| Dipole Moment | 9,707 | 9,707 |
| Electron Affinity | 200 | 200 |
| Enthalpy (Heat) of Formation for Ideal Gas at 298.15 K | 1,960 | 1,960 |
| Enthalpy (Heat) of Fusion at Melting Point | 3,930 | 3,930 |
| Flash Point | 3,728 | 3,728 |
| Gibbs Energy of Formation for Ideal Gas at 298.15 K and 1 bar | 1,823 | 1,823 |
| Heat (Enthalpy) of Vaporization at 298.15 K | 2,875 | 2,875 |
| Heat (Enthalpy) of Vaporization at Normal Boiling Point | 1,210 | 1,210 |
| Ionization Potential | 6,079 | 6,079 |
| Liquid Density at Normal Boiling Point | 1,813 | 1,813 |
| Liquid Molar Volume at 298.15 K | 8,140 | 8,140 |
| Lower Flammability Limit Temperature | 1,067 | 1,067 |
| Lower Flammability Limit Volume Percent | 391 | 391 |
| Magnetic Susceptibility | 1,205 | 1,205 |
| Net Standard State Enthalpy (Heat) of Combustion at 298.15 K | 2,163 | 2,163 |
| Normal Boiling Point | 35,489 | 35,489 |
| Parachor | 965 | 965 |
| Polarizability | 380 | 380 |
| Radius of Gyration | 1,367 | 1,367 |
| Refractive Index | 60,974 | 60,974 |
| Solubility Parameter at 298.15 K | 1,494 | 1,494 |
| Standard State Absolute Entropy at 298.15 K and 1 bar | 1,105 | 1,105 |
| Standard State Enthalpy (Heat) of Formation at 298.15 K and 1 bar | 3,506 | 3,506 |
| Standard State Gibbs Energy of Formation at 298.15 K and 1 bar | 1,170 | 1,170 |
| Upper Flammability Limit Temperature | 1,421 | 1,421 |
| Upper Flammability Limit Volume Percent | 972 | 972 |
| Melting Point | 3,583 | 3,583 |
| LogP (Octanol-Water Partition Coefficient) | 29,199 | 29,199 |
| LogS (Water Solubility) | 7,443 | 7,443 |

Temperature Dependent Properties

| Property Name | Number of Compounds | Number of Data Points |
| --- | --- | --- |
| Heat Capacity of Ideal Gas | 1,105 | 19,353 |
| Heat Capacity of Liquid | 919 | 17,093 |
| Heat of Vaporization | 1,371 | 35,941 |
| Liquid Density | 2,770 | 19,686 |
| Second Virial Coefficient | 485 | 14,829 |
| Surface Tension | 1,195 | 11,984 |
| Thermal Conductivity of Gas | 889 | 8,936 |
| Thermal Conductivity of Liquid | 1,480 | 18,084 |
| Vapor Pressure of Liquid | 2,939 | 41,257 |
| Viscosity of Gas | 1,256 | 12,557 |
| Viscosity of Liquid | 1,841 | 30,367 |

-
The experimental data was collected to validate the reliability of our QSQN technology. Because raw experimental data frequently contains measurement errors, we applied a systematic refinement process consisting of basic analysis, statistical filtering, and similarity analysis. For more details, see our refinement example, which uses the normal boiling point as a case study.
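To make the statistical-filtering step concrete, here is a minimal sketch of one common approach: dropping readings whose modified z-score (based on the median and the median absolute deviation) exceeds a cutoff. This is an illustrative assumption, not the refinement procedure described on this page; the function name `filter_outliers`, the 3.5 cutoff, and the sample values are all hypothetical.

```python
from statistics import median


def filter_outliers(values, threshold=3.5):
    """Return the values whose modified z-score is within `threshold`.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so one bad reading cannot hide by inflating the spread.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # No measurable spread around the median: nothing to judge against.
        return list(values)
    return [v for v in values if abs(0.6745 * (v - center) / mad) <= threshold]


# Hypothetical reported normal boiling points (K) for a single compound;
# the last entry mimics a digit-transposition error.
reported = [351.4, 351.5, 351.6, 351.3, 315.4]
print(filter_outliers(reported))
```

With these sample values the 315.4 K entry is dropped and the four consistent readings are kept.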
Reliability-Proven Predicted Property Data
-
We provide thermo-physicochemical, thermodynamic, transport, and pharmaceutical property data for over 1 billion compounds as listed below.
| Property Name | QSQN Technology: Number of Compounds | QSQN Technology: Average Accuracy (%)* | QN Technology: Number of Compounds | QN Technology: Average Accuracy (%)* |
| --- | --- | --- | --- | --- |
| Absolute Entropy of Ideal Gas at 298.15 K and 1 bar | 4+ Million | 95.90% | 1+ Billion | 98.67% |
| Acentric Factor | 4+ Million | 93.70% | 1+ Billion | 94.45% |
| Activity Score for GPCR Ligands | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Activity Score for Ion Channel Modulators | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Activity Score for Kinase Inhibitors | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Activity Score for Nuclear Receptor Ligands | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Critical Compressibility Factor | 4+ Million | 94.90% | 1+ Billion | 91.70% |
| Critical Pressure | 4+ Million | 94.86% | 1+ Billion | 93.00% |
| Critical Temperature | 4+ Million | 95.49% | 1+ Billion | 97.14% |
| Critical Volume | 4+ Million | 94.31% | 1+ Billion | 97.15% |
| Dipole Moment | 4+ Million | 92.20% | N.AVA. | N.AVA. |
| Drug-Likeness | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Electron Affinity | 4+ Million | 90.41% | N.AVA. | N.AVA. |
| Enthalpy (Heat) of Formation for Ideal Gas at 298.15 K | 4+ Million | 96.15% | 1+ Billion | 90.15% |
| Enthalpy (Heat) of Fusion at Melting Point | 4+ Million | 96.77% | N.AVA. | N.AVA. |
| Flash Point | 4+ Million | 95.30% | 1+ Billion | 99.94% |
| Ghose-Crippen Molar Refractivity | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Ghose-Crippen Octanol-Water Partition Coeff. (logP) | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Gibbs Energy of Formation for Ideal Gas at 298.15 K and 1 bar | 4+ Million | 95.04% | N.AVA. | N.AVA. |
| Heat (Enthalpy) of Vaporization at 298.15 K | 4+ Million | 95.47% | 1+ Billion | 78.44% |
| Heat (Enthalpy) of Vaporization at Normal Boiling Point | 4+ Million | 95.77% | 1+ Billion | 97.31% |
| Heat Capacity of Ideal Gas | 4+ Million | 96.70% | N.AVA. | N.AVA. |
| Heat Capacity of Liquid | 4+ Million | 99.28% | N.AVA. | N.AVA. |
| Heat of Vaporization | 4+ Million | 98.67% | N.AVA. | N.AVA. |
| Ionization Potential | 4+ Million | 95.41% | N.AVA. | N.AVA. |
| Lipinski Alert Index | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Liquid Density | 4+ Million | 99.09% | N.AVA. | N.AVA. |
| Liquid Density at Normal Boiling Point | 4+ Million | 95.98% | 1+ Billion | 98.92% |
| Liquid Molar Volume at 298.15 K | 4+ Million | 97.21% | 1+ Billion | 94.82% |
| LogP (Octanol-Water Partition Coefficient) | 4+ Million | 96.72% | N.AVA. | N.AVA. |
| LogS (Water Solubility) | 4+ Million | 96.28% | 1+ Billion | 86.61% |
| Lower Flammability Limit Temperature | 4+ Million | 96.43% | 1+ Billion | 98.92% |
| Lower Flammability Limit Volume Percent | 4+ Million | 95.08% | 1+ Billion | 66.33% |
| Magnetic Susceptibility | 4+ Million | 94.68% | 1+ Billion | 85.64% |
| Moriguchi Octanol-Water Partition Coeff. (logP) | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Net Standard State Enthalpy (Heat) of Combustion at 298.15 K | 4+ Million | 95.87% | 1+ Billion | 86.94% |
| Normal Boiling Point | 4+ Million | 95.02% | 1+ Billion | 99.69% |
| Number of Acceptor Atoms for H-bonds (N,O) | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Number of Donor Atoms for H-bonds (N,O) | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Melting Point | 4+ Million | 84.90% | 1+ Billion | 83.08% |
| Parachor | 4+ Million | 97.18% | 1+ Billion | 94.41% |
| Polarizability | 4+ Million | 91.02% | 1+ Billion | 84.13% |
| Radius of Gyration | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Refractive Index | 4+ Million | 95.99% | 1+ Billion | 84.11% |
| Second Virial Coefficient | 4+ Million | 91.94% | N.AVA. | N.AVA. |
| Solubility Parameter at 298.15 K | 4+ Million | 95.88% | 1+ Billion | 98.94% |
| Standard State Absolute Entropy at 298.15 K and 1 bar | 4+ Million | 97.49% | 1+ Billion | 94.52% |
| Standard State Enthalpy (Heat) of Formation at 298.15 K and 1 bar | 4+ Million | 94.87% | 1+ Billion | 79.53% |
| Standard State Gibbs Energy of Formation at 298.15 K and 1 bar | 4+ Million | 94.14% | N.AVA. | N.AVA. |
| Surface Tension | 4+ Million | 93.59% | N.AVA. | N.AVA. |
| Thermal Conductivity of Gas | 4+ Million | 90.08% | N.AVA. | N.AVA. |
| Thermal Conductivity of Liquid | 4+ Million | 91.82% | N.AVA. | N.AVA. |
| Upper Flammability Limit Temperature | 4+ Million | 94.22% | 1+ Billion | 96.66% |
| Upper Flammability Limit Volume Percent | 4+ Million | 95.43% | 1+ Billion | 92.96% |
| van der Waals Area | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| van der Waals Reduced Volume | 4+ Million | N.APP. | N.AVA. | N.AVA. |
| Vapor Pressure of Liquid | 4+ Million | 97.86% | N.AVA. | N.AVA. |
| Viscosity of Gas | 4+ Million | 98.37% | N.AVA. | N.AVA. |
| Viscosity of Liquid | 4+ Million | 89.62% | N.AVA. | N.AVA. |

-
The property data listed above is produced using our proprietary QSQN and QN technologies, which are backed by 41 patents, and has been validated for reliability. The compounds are composed of C, H, N, O, S, F, Cl, Br, I, Si, P, and/or As and span a wide range of structures and compositions.
Molecular Structure and Identifier
-
We offer comprehensive molecular structures and identifiers for over 1 billion compounds as listed below.
| Structure & Identifier | Number of Compounds |
| --- | --- |
| Optimized 3D Structure Data (Mol) File | 4+ Million |
| 3D Structure Data (Mol) File | 1+ Billion |
| 2D Structure Data (Mol) File | 1+ Billion |
| SMILES String | 1+ Billion |
| InChI | 1+ Billion |
| InChIKey | 1+ Billion |

-
The optimized 3D structures are produced using a high-quality quantum chemical computation process. We perform conformer analysis to select the lowest energy structure, followed by geometry optimization with the DFT-B3LYP functional and 6-31G basis set. All optimized structures are verified to ensure the absence of imaginary frequencies.
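The final verification step can be sketched as a simple check on a list of computed vibrational frequencies. This assumes the common convention that quantum chemistry programs report imaginary modes as negative wavenumbers; the function names and the sample frequencies are hypothetical and do not describe the pipeline used here.

```python
def imaginary_modes(frequencies_cm1, tolerance=0.0):
    """Return the frequencies reported as imaginary (negative by convention).

    `tolerance` lets callers ignore tiny negative values that are
    numerical noise rather than genuine imaginary modes.
    """
    return [f for f in frequencies_cm1 if f < -tolerance]


def is_true_minimum(frequencies_cm1, tolerance=0.0):
    """A geometry counts as a minimum only if no imaginary mode remains."""
    return not imaginary_modes(frequencies_cm1, tolerance)


# Hypothetical frequency lists (cm^-1) for two optimized geometries.
optimized = [1595.2, 3657.1, 3756.0]
saddle_like = [-412.7, 1203.4, 3102.9]

print(is_true_minimum(optimized))    # all modes real
print(imaginary_modes(saddle_like))  # the one negative entry
```

A structure that fails this check would normally be re-optimized before its data is used.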
Spectroscopic Data
-
We provide comprehensive spectroscopic data for over 4 million compounds, with key information for molecular identification and analysis available for every compound. The following spectroscopic data is available:
-
-
Infrared (IR) Spectroscopy:
Detailed vibrational frequencies and intensities essential for identifying functional groups and molecular structures. IR spectroscopy data is available in JDX (JCAMP Chemical Spectroscopic Data Exchange Format) as well, ensuring easy integration into various analysis tools and workflows.
-
Nuclear Magnetic Resonance (NMR) Spectroscopy:
¹H, ¹³C, ¹⁵N, ¹⁷O, and ³³S NMR data, providing insights into molecular environments, bonding interactions, and chemical connectivity.
-
Vibrational Circular Dichroism (VCD) Spectroscopy:
Provides chiral-sensitive data that enhances molecular structure determination, particularly useful for studying the stereochemistry of organic compounds.
-
These datasets are derived from high-quality quantum chemical computations as outlined in our proprietary QSQN technology.
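Since the IR data is offered in JDX format, the sketch below shows how the header of such a file might be read with nothing but the standard library. It relies only on the general JCAMP-DX layout of `##LABEL=value` records and `$$` comments; the sample text is hypothetical, the function name `read_jdx_header` is an assumption, and the numeric data table is deliberately not decoded because JCAMP-DX allows several compressed encodings.

```python
def read_jdx_header(text):
    """Collect JCAMP-DX labeled data records (##LABEL=value) into a dict.

    Only the header records are gathered; rows of spectral data are skipped.
    """
    header = {}
    for raw_line in text.splitlines():
        line = raw_line.split("$$", 1)[0].strip()  # drop $$ comments
        if not line.startswith("##") or "=" not in line:
            continue  # data rows and blank lines
        label, value = line[2:].split("=", 1)
        label = label.strip().upper()
        if label == "END":
            break
        header[label] = value.strip()
    return header


# Hypothetical minimal JDX content, for illustration only.
sample = """##TITLE=example compound
##JCAMP-DX=4.24
##DATA TYPE=INFRARED SPECTRUM
##XUNITS=1/CM  $$ wavenumber axis
##YUNITS=TRANSMITTANCE
##XYDATA=(X++(Y..Y))
4000 0.95 0.94 0.93
##END=
"""

header = read_jdx_header(sample)
print(header["DATA TYPE"], header["XUNITS"])
```

For real workflows, a dedicated JCAMP-DX reader is a safer choice than a hand-written parser like this one.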
Quantum Chemical Data
-
We provide an extensive collection of quantum chemical data for over 4 million compounds, critical for developing AI models in areas like molecular design, drug discovery, and materials science. The data listed below represents only a subset of the complete data (Download full data list).
-
Quantum Chemical Computation Data
-
Vibrational Frequency Data:
Provides key insights into molecular vibrations, essential for understanding molecular dynamics.
-
Total Energy:
Critical for determining the stability and potential reactivity of a molecule.
-
Molecular Orbital Energies (HOMO, LUMO):
Important indicators of a molecule's ability to donate or accept electrons, useful for predicting chemical reactivity.
-
Mulliken Charges, Cartesian Coordinates, and Force Constants:
Important computational outputs used for electronic structure analysis and geometry optimization.
-
Quantum Chemical Descriptors (3-Dimensional Molecular Descriptors)
-
HOMO-LUMO Energy Gap:
A widely used metric for predicting molecular stability and chemical reactivity.
-
Nucleophilic and Electrophilic Reactivity Indices:
Critical for understanding how a molecule interacts with other species in nucleophilic (electron-donating) and electrophilic (electron-accepting) reactions.
-
Atomic Charges and Molecular Dipole:
Key indicators of charge distribution within the molecule, influencing how it interacts in chemical and biological environments.
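As an illustration of how descriptors like these are commonly derived from frontier orbital energies, the sketch below computes the HOMO-LUMO gap and a few conceptual-DFT quantities. The page does not state which reactivity indices or conventions are used, so the formulas here are assumptions: a Koopmans-type approximation (I ≈ −E_HOMO, A ≈ −E_LUMO), chemical potential μ = (E_HOMO + E_LUMO)/2, hardness η = E_LUMO − E_HOMO, and electrophilicity index ω = μ²/(2η). The orbital energies are hypothetical.

```python
def frontier_orbital_descriptors(e_homo, e_lumo):
    """Derive simple reactivity descriptors from HOMO/LUMO energies.

    Both energies must be in the same unit (eV here). Conventions are an
    assumption: eta = I - A, with I and A taken from the orbital energies.
    """
    gap = e_lumo - e_homo
    if gap <= 0:
        raise ValueError("LUMO energy must lie above HOMO energy")
    chemical_potential = (e_homo + e_lumo) / 2.0
    hardness = gap
    electrophilicity = chemical_potential ** 2 / (2.0 * hardness)
    return {
        "homo_lumo_gap": gap,
        "chemical_potential": chemical_potential,
        "hardness": hardness,
        "electrophilicity_index": electrophilicity,
    }


# Hypothetical orbital energies in eV.
result = frontier_orbital_descriptors(e_homo=-6.0, e_lumo=-1.0)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Other conventions exist (for example η = (I − A)/2), so values should only be compared when they were computed the same way.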
-
Electrostatic Descriptors (3-Dimensional Molecular Descriptors)
Electrostatic properties are crucial for understanding a molecule’s behavior in different environments, such as solvent interactions:
-
Max and Min Partial Charges:
Provide insight into the distribution of charge across a molecule, helping predict areas of electron density.
-
Polarity Parameter:
Measures the overall polarity of a molecule, which is important for predicting solubility and molecular interactions.
-
Surface Area Metrics:
Includes Total Molecular Surface Area, Partial Positive and Negative Surface Areas, and Charge-Weighted Surface Areas—essential for understanding molecular interactions like binding affinity.
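The surface area metrics above can be illustrated with a short sketch that combines per-atom partial charges with per-atom surface areas. The definitions follow the widely used charged-partial-surface-area style (sum of areas over positively or negatively charged atoms, and the same sums weighted by atomic charge). Whether this dataset uses exactly these definitions is an assumption, and the input values and function name are hypothetical.

```python
def surface_area_descriptors(charges, areas):
    """Compute simple surface area metrics from per-atom charges and areas.

    `charges` and `areas` are parallel sequences: one partial charge and
    one accessible surface area per atom.
    """
    if len(charges) != len(areas):
        raise ValueError("charges and areas must have the same length")
    positive = [(q, a) for q, a in zip(charges, areas) if q > 0]
    negative = [(q, a) for q, a in zip(charges, areas) if q < 0]
    return {
        "total_surface_area": sum(areas),
        "partial_positive_area": sum(a for _, a in positive),
        "partial_negative_area": sum(a for _, a in negative),
        "charge_weighted_positive_area": sum(q * a for q, a in positive),
        "charge_weighted_negative_area": sum(q * a for q, a in negative),
        "max_partial_charge": max(charges),
        "min_partial_charge": min(charges),
    }


# Hypothetical three-atom example: charges in e, areas in square angstroms.
metrics = surface_area_descriptors(
    charges=[0.4, -0.3, -0.1],
    areas=[10.0, 20.0, 30.0],
)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```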
-
Quantum Chemical Computation Result File
-
FCHK File:
A file that contains comprehensive quantum chemical computation results, providing a complete dataset for each molecule that can be further analyzed or integrated into AI models.
Molecular Descriptor (2-Dimensional) Data
-
Molecular descriptors represent the properties of molecules mathematically and play a crucial role in improving the performance of AI models. They enable AI systems to understand and predict the structural, chemical, and geometrical characteristics of molecules more accurately in fields such as chemical research, materials science, and drug design.
-
We offer over 3,000 2-dimensional molecular descriptors for more than 1 billion compounds, organized into 20 distinct categories, providing detailed insights into molecular behavior for AI-driven research. Below are the descriptor categories and the corresponding number of available descriptors.
| Descriptor Name | No. of Descriptors |
| --- | --- |
| Constitutional Descriptors | 43 |
| Ring Descriptors | 32 |
| Topological Descriptors | 75 |
| Walk And Path Counts | 46 |
| Connectivity Indices | 37 |
| Information Indices | 48 |
| 2D Matrix-Based Descriptors | 550 |
| 2D Autocorrelation Indices | 213 |
| Burden Eigenvalue Descriptors | 96 |
| P_VSA-Like Descriptors | 45 |
| ETA Indices | 23 |
| Edge Adjacency Indices | 324 |
| Functional Group Counts | 155 |
| Atom-Centred Fragments | 115 |
| Atom-Type E-State Indices | 169 |
| CATS 2D Descriptors | 150 |
| 2D Atom Pair Descriptors | 1,596 |
| Charge Descriptors | 15 |
| Molecular Properties | 20 |
| Drug-Like Indices | 27 |

-
Each descriptor category provides a unique perspective on the molecular structure and properties. For a detailed list of descriptors within each category, a separate file is available for download (Download full data list).
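As a quick consistency check, the category counts listed in this section can be tallied against the "over 3,000 descriptors in 20 categories" figure. The numbers below are copied from the list above; only the variable names are introduced here.

```python
# Descriptor counts per category, as listed in this section.
descriptor_counts = {
    "Constitutional Descriptors": 43,
    "Ring Descriptors": 32,
    "Topological Descriptors": 75,
    "Walk And Path Counts": 46,
    "Connectivity Indices": 37,
    "Information Indices": 48,
    "2D Matrix-Based Descriptors": 550,
    "2D Autocorrelation Indices": 213,
    "Burden Eigenvalue Descriptors": 96,
    "P_VSA-Like Descriptors": 45,
    "ETA Indices": 23,
    "Edge Adjacency Indices": 324,
    "Functional Group Counts": 155,
    "Atom-Centred Fragments": 115,
    "Atom-Type E-State Indices": 169,
    "CATS 2D Descriptors": 150,
    "2D Atom Pair Descriptors": 1596,
    "Charge Descriptors": 15,
    "Molecular Properties": 20,
    "Drug-Like Indices": 27,
}

category_count = len(descriptor_counts)
total_descriptors = sum(descriptor_counts.values())
largest = max(descriptor_counts, key=descriptor_counts.get)

print(category_count, total_descriptors, largest)
```

The tally gives 3,779 descriptors across 20 categories, with 2D Atom Pair Descriptors making up the largest share.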
How to Obtain Our Chemical Big Data
-
We understand that each customer’s data needs are unique. We therefore offer the flexibility to choose only the specific data you need from our 30+ billion datasets, ensuring a fully customized solution tailored to your exact requirements.
-
Simply contact us at contact@cc-dps.com and let us know the type of data and compounds you need, the quantity of data points or compounds, and your preferred format for receiving the data (e.g., as a file, integrated database, API, or other specific formats). Once we receive your inquiry, we will provide you with a tailored quote based on your specific requirements.
-
Feel free to reach out to us anytime—we’re here to help you get the data you need for your AI development.