Mediclin Clinical Research

Biostatistics and data analysis (SAS)

What is Biostatistics and data analysis (SAS)?

Biostatistics is a program that focuses on the application of descriptive and inferential statistics to biomedical research and to clinical, public health, and industrial issues related to human populations.

Biostatistics (also known as biometry) is the development and application of statistical methods to a wide range of topics in biology. It encompasses the design of biological experiments, the collection and analysis of data from those experiments, and the interpretation of the results.

Let's discuss some history of Biostatistics:

Biostatistical modeling forms an important part of numerous modern biological theories. Since its beginning, genetics has used statistical concepts to understand observed experimental results, and some geneticists even contributed statistical advances by developing methods and tools. Gregor Mendel began the study of genetics by investigating segregation patterns in families of peas, and he used statistics to explain the data he collected. In the early 1900s, after the rediscovery of Mendel’s work on inheritance, there were gaps in understanding between genetics and evolutionary Darwinism. Francis Galton tried to expand Mendel’s discoveries with human data and proposed a different model, in which fractions of the heredity come from each ancestor and compose an infinite series. He called this the “Law of Ancestral Heredity”. His ideas were strongly disputed by William Bateson, who followed Mendel’s conclusion that genetic inheritance comes exclusively from the parents, half from each of them. This led to a vigorous debate between the biometricians, who supported Galton’s ideas, such as Walter Weldon, Arthur Dukinfield Darbishire and Karl Pearson, and the Mendelians, who supported Bateson’s (and Mendel’s) ideas, such as Charles Davenport and Wilhelm Johannsen. Later, the biometricians could not reproduce Galton’s conclusions in different experiments, and Mendel’s ideas prevailed. By the 1930s, models built on statistical reasoning had helped to resolve these differences and to produce the neo-Darwinian modern evolutionary synthesis.

Resolving these differences also made it possible to define the concept of population genetics and brought together genetics and evolution. The three leading figures in the establishment of population genetics and this synthesis, Ronald Fisher, Sewall Wright and J. B. S. Haldane, all relied on statistics and developed its use in biology.

Many tools can be used for the statistical analysis of biological data. Most of them are also useful in other areas of knowledge and cover a large number of applications. Here are brief descriptions of some of them:

ASReml : Software developed by VSNi that can also be used in the R environment as a package. It estimates variance components under a general linear mixed model using restricted maximum likelihood (REML). Models with fixed and random effects, nested or crossed, are allowed, and different variance-covariance matrix structures can be investigated.

CycDesigN : A computer package developed by VSNi that helps researchers create experimental designs and analyze data from designs in the classes it handles: resolvable, non-resolvable, partially replicated and crossover designs. It also includes less commonly used designs, the Latinized ones, such as the t-Latinized design.

Orange : A programming interface for high-level data processing, data mining and data visualization. It includes tools for gene expression and genomics.

R : An open-source environment and programming language dedicated to statistical computing and graphics. It is an implementation of the S language and is distributed through the Comprehensive R Archive Network (CRAN). In addition to its functions for reading data tables, computing descriptive statistics, and developing and evaluating models, its repository contains packages developed by researchers around the world. This allows functions to be written for the statistical analysis of data from specific applications. In bioinformatics, for example, there are packages in the main repository (CRAN) and in others, such as Bioconductor. It is also possible to use packages under development that are shared on hosting services such as GitHub.

SAS : A widely used data analysis software package, found across universities, services and industry. Developed by a company of the same name (SAS Institute), it uses the SAS language for programming.

PLA 3.0 : A biostatistical analysis software package for regulated environments (e.g. drug testing) that supports Quantitative Response Assays (Parallel-Line, Parallel-Logistics, Slope-Ratio) and Dichotomous Assays (Quantal Response, Binary Assays). It also supports weighting methods for combination calculations and the automatic aggregation of independent assay data.

Weka : Java software for machine learning and data mining, including tools and methods for visualization, clustering, regression, association rules, and classification. There are tools for cross-validation and bootstrapping, and a module for comparing algorithms. Weka can also be run from other programming languages, such as Perl or R.

What is SAS biostatistics?

SAS/STAT includes exact techniques for small data sets, high-performance statistical modeling tools for large data tasks and modern methods for analyzing data with missing values.

SAS is a tool for analyzing statistical data. The name originally stood for Statistical Analysis System. The main purpose of SAS is to retrieve, report and analyze statistical data. Every statement in a SAS program ends with a semicolon; leaving it out usually produces an error message.

SAS was developed for advanced analytics and complex statistical operations. It is used by large organizations and professionals because of its high reliability.

SAS is easy to learn and offers an easy option (PROC SQL) for people who already know SQL. For everyone else, it has a stable graphical user interface. In terms of resources, tutorials are available on the websites of various universities, and SAS has comprehensive documentation.

Research Planning

Any research in the life sciences is proposed to answer a scientific question. To answer this question with high certainty, we need accurate results. A correct definition of the main hypothesis and of the research plan reduces errors when drawing conclusions about a phenomenon. The research plan might include the research question, the hypothesis to be tested, the experimental design, data collection methods, data analysis perspectives and the costs involved. It is essential to conduct the study according to the three basic principles of experimental statistics: randomization, replication, and local control.
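As a rough illustration of the three principles, here is a minimal Python sketch of a randomized complete block design. It assumes that "local control" is implemented as blocking, that each block receives every treatment once (so the number of blocks is the number of replicates), and that the order within each block is shuffled. The function name and the treatment labels are hypothetical and only serve the example.

```python
import random


def randomized_complete_block_design(treatments, n_blocks, seed=None):
    """Return a dict mapping block number -> randomized treatment order.

    Replication: every treatment appears once in each of the n_blocks blocks.
    Local control: units are grouped into blocks (assumed homogeneous inside).
    Randomization: the treatment order is shuffled independently per block.
    """
    if n_blocks < 1:
        raise ValueError("n_blocks must be at least 1")
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    design = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)
        rng.shuffle(order)
        design[block] = order
    return design


# Illustrative call: three hypothetical treatments, four blocks.
plan = randomized_complete_block_design(["A", "B", "C"], n_blocks=4, seed=2024)
for block, order in plan.items():
    print(f"block {block}: {order}")
```

Fixing the seed is a design choice that keeps the allocation auditable; a real study would document how the seed was chosen.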

Despite the fundamental importance and frequent necessity of statistical reasoning, there may nonetheless have been a tendency among biologists to distrust or deprecate results which are not qualitatively apparent. One anecdote describes Thomas Hunt Morgan banning the Friden calculator from his department at Caltech, saying “Well, I am like a guy who is prospecting for gold along the banks of the Sacramento River in 1849. With a little intelligence, I can reach down and pick up big nuggets of gold. And as long as I can do that, I’m not going to let any people in my department waste scarce resources in placer mining.”

What is a Bioequivalence BA/BE study?

BA/BE stands for bioavailability and bioequivalence. Bioequivalence is a term in pharmacokinetics used to assess the expected in vivo biological equivalence of a generic version of a drug to its proprietary version, or of formulations of an innovator drug in different clinical trial phases.
Objectives of bioavailability studies:
1) Development of suitable dosage forms of a new drug entity during the primary stages of development.
2) Determination of the influence of excipients, patient-related factors and possible interactions with other drugs on the efficiency of absorption.
3) Development of new formulations of existing drugs.

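Which drug has the highest bioavailability (Bioequivalence)?

By definition, a drug given intravenously has 100% bioavailability, because the whole dose enters the systemic circulation. Oral morphine is an example at the low end: its bioavailability is only about 30% (some sources put it closer to 25%), because roughly 70% is metabolized via the first-pass effect. Morphine is therefore often given via s.c. injection to bypass this mechanism.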

Example of bioavailability (Bioequivalence BA/BE Study)
Bioavailability is (1) the fraction of an administered dose of a drug that reaches the systemic circulation as intact drug (expressed as F) and (2) the rate at which this occurs. … For example, orally administered morphine has a bioavailability of about 25 percent due to significant first-pass metabolism in the liver.
Absolute bioavailability
What does it mean? Absolute bioavailability refers to the amount of the drug available to the body or system. It is measured as the ratio of the AUC after oral (or other extravascular) administration to the AUC after intravenous administration, each corrected for the dose given. It cannot exceed 1, since it is assumed that 100% of the drug is available to the body after IV administration.

Here is another way to state what absolute bioavailability (Bioequivalence) is:

“Absolute” bioavailability is the amount of drug from a formulation that reaches the systemic circulation relative to an intravenous (IV) dose, which is injected directly into the systemic circulation.
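The definition can be written as F = (AUC_oral / Dose_oral) / (AUC_iv / Dose_iv). The short Python sketch below evaluates that ratio. The function name is hypothetical and the numbers in the example call are made up for illustration; they are not measurements from any study.

```python
def absolute_bioavailability(auc_oral, dose_oral, auc_iv, dose_iv):
    """Dose-normalized ratio of extravascular AUC to intravenous AUC.

    All AUC values must share one unit (e.g. ng*h/mL) and both doses
    must share one unit (e.g. mg). The result F is a fraction.
    """
    if dose_oral <= 0 or dose_iv <= 0 or auc_iv <= 0:
        raise ValueError("doses and the IV AUC must be positive")
    if auc_oral < 0:
        raise ValueError("the oral AUC cannot be negative")
    return (auc_oral / dose_oral) / (auc_iv / dose_iv)


# Illustrative, invented numbers: same dose by both routes,
# oral AUC is 30% of the IV AUC, so F is about 0.3.
f = absolute_bioavailability(auc_oral=30.0, dose_oral=10.0,
                             auc_iv=100.0, dose_iv=10.0)
print(f"F = {f:.2f}")
```

Normalizing by dose matters because the oral and IV doses in a study are often different.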

What is needed for good bioavailability (Bioequivalence)?

The most reliable measure of a drug’s bioavailability is AUC. AUC is directly proportional to the total amount of unchanged drug that reaches systemic circulation. Drug products may be considered bioequivalent in extent and rate of absorption if their plasma concentration curves are essentially superimposable.

Now you might be wondering: what is AUC in bioequivalence?

Bioequivalence is judged mainly on two measures: the area under the blood concentration versus time curve (AUC) and the maximum blood concentration (Cmax). If two drugs are bioequivalent, there is no clinically significant difference in their bioavailability.
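Both measures can be read off a sampled concentration-time profile. The Python sketch below assumes the linear trapezoidal rule for the AUC, which is a common choice but not the only one, and reports Cmax together with the time at which it occurs. The function names and the sample values are hypothetical.

```python
def auc_trapezoid(times, concentrations):
    """Area under the concentration-time curve by the linear trapezoidal rule.

    Covers only the sampled interval (first to last time point);
    extrapolation beyond the last sample is not attempted here.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration points")
    if any(t2 <= t1 for t1, t2 in zip(times, times[1:])):
        raise ValueError("times must be strictly increasing")
    pairs = list(zip(times, concentrations))
    return sum((t2 - t1) * (c1 + c2) / 2.0
               for (t1, c1), (t2, c2) in zip(pairs, pairs[1:]))


def cmax_and_tmax(times, concentrations):
    """Return the highest observed concentration and the time it was observed."""
    i = max(range(len(concentrations)), key=lambda k: concentrations[k])
    return concentrations[i], times[i]


# Invented profile: hours after dosing and concentration in arbitrary units.
t = [0, 1, 2, 4]
c = [0, 4, 2, 0]
print("AUC:", auc_trapezoid(t, c))
print("Cmax, Tmax:", cmax_and_tmax(t, c))
```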

How is AUC calculated in bioequivalence?

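In a bioequivalence study, the AUC is the area under the measured concentration-time curve, and it is usually estimated from the sampled concentrations with the trapezoidal rule. It should not be confused with the area under the ROC curve, a classifier metric that shares the abbreviation. That AUC can be computed from a matrix comparing every positive case with every negative case: cells where the positive case outranks the negative case receive a 1, cells where the negative case has the higher rank receive a 0, and cells with ties receive 0.5 (applying the sign function to the difference in scores gives values of 1, -1, and 0, which are then rescaled). The mean of all cells is the AUC.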
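The pairwise scoring described above (1 when the positive case wins, 0 when it loses, 0.5 for a tie) yields the area under the ROC curve once the cell values are averaged. Note that this is the classifier AUC, not the pharmacokinetic AUC of a concentration-time curve. A minimal Python sketch, with a hypothetical function name and invented scores:

```python
def roc_auc_pairwise(positive_scores, negative_scores):
    """ROC AUC as the mean of pairwise comparisons.

    Each (positive, negative) pair contributes 1.0 if the positive case
    has the higher score, 0.5 if the scores tie, and 0.0 otherwise.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative case")
    total = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(positive_scores) * len(negative_scores))


# Invented classifier scores for three positive and three negative cases.
auc = roc_auc_pairwise([0.9, 0.8, 0.4], [0.7, 0.4, 0.2])
print(f"ROC AUC = {auc:.3f}")
```

The double loop is quadratic in the number of cases, which is fine for small examples; rank-based formulas give the same value more efficiently.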


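What is a good AUC number?

For the area under the ROC curve (the classifier metric, not the pharmacokinetic AUC), results are commonly considered excellent for AUC values between 0.9-1, good for AUC values between 0.8-0.9, fair for AUC values between 0.7-0.8, poor for AUC values between 0.6-0.7 and failed for AUC values between 0.5-0.6. In a bioequivalence study there is no good AUC in absolute terms. What matters is how closely the AUC of the test product matches that of the reference: regulators such as the FDA and EMA generally require the 90% confidence interval of the test/reference geometric mean ratio to lie within 80.00% to 125.00%.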

What is bioequivalence study design?

A bioequivalence study compares the bioavailability of a test and a reference drug product in terms of the rate and extent of drug absorption. The experimental design of a bioequivalence study is usually a crossover, and only rarely a parallel or paired comparative design.

Why do we need bioequivalence studies?

Bioequivalence studies are very important for the development of a pharmaceutical preparation in the pharmaceutical industry. Their rationale is the monitoring of pharmacokinetic and pharmacodynamic parameters after the administration of tested drugs.
Factors which influence bioavailability:
1) Drug concentration at site of administration.
2) Surface area of the absorptive site.
3) Drug pKa.
4) Drug molecule size.
5) pH of the surrounding fluid.
The pKa of a drug is the pH at which 50% of the drug exists in its ionized, hydrophilic form (i.e., in equilibrium with its un-ionized, lipophilic form). All local anesthetic agents are weak bases. At physiologic pH, the lower the pKa of a weak base, the larger the un-ionized fraction and the greater the lipophilicity.
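The relationship between pH, pKa and ionization follows from the Henderson-Hasselbalch equation. The Python sketch below applies it to a weak base, for which the ionized fraction is 1 / (1 + 10^(pH - pKa)). The function name is hypothetical, and the two pKa values in the example are illustrative rather than values for specific drugs.

```python
def ionized_fraction_weak_base(ph, pka):
    """Fraction of a weak base present in its ionized (protonated) form.

    From Henderson-Hasselbalch: [un-ionized]/[ionized] = 10 ** (pH - pKa),
    so ionized fraction = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PHYSIOLOGIC_PH = 7.4

# At pH equal to the pKa, half of the drug is ionized.
print(ionized_fraction_weak_base(ph=7.9, pka=7.9))

# Two illustrative weak bases at physiologic pH: the one with the
# lower pKa has the larger un-ionized (lipophilic) fraction.
for pka in (7.9, 8.9):
    unionized = 1.0 - ionized_fraction_weak_base(PHYSIOLOGIC_PH, pka)
    print(f"pKa {pka}: un-ionized fraction = {unionized:.3f}")
```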

Conclusion of Bioequivalence BA/BE Study:

Bioequivalence between the test and reference formulations, in terms of both rate and extent of absorption under fasting conditions, was concluded according to European guidelines. Both formulations were well tolerated. The conclusion of bioequivalence was also supported by the truncated AUC approach. Bioavailability is a key indicator of drug absorption. It represents the fraction of the administered dose that reaches the systemic circulation when the drug is given orally or by any other extravascular route.
