
A Critical Analysis of the FDA Draft Guidance on Development of Therapeutic Protein Biosimilars: Comparative Analytical Assessment and Other Quality-Related Considerations

June 4, 2019


The FDA has issued a long-awaited and highly anticipated guidance on establishing analytical testing of biosimilars to a reference product. This article summarizes the essential elements of the guidance, and it identifies pivotal changes and recommended study designs.

Introduction

In May 2019, the FDA issued a long-awaited and highly anticipated guidance on establishing analytical testing of biosimilars to a reference product after it withdrew its guidance, Statistical Approaches to Evaluate Analytical Similarity of biosimilars, in June 2018.

The new draft guidance applies to proposed biosimilars and to other protein products, such as in vivo protein diagnostic products, and is intended to recommend to sponsors the scientific and technical information needed for the chemistry, manufacturing, and controls (CMC) portion of a marketing application for a proposed product submitted under section 351(k) of the Public Health Service (PHS) Act.

This article summarizes the essential elements of the guidance, and it identifies pivotal changes and recommended study designs. It also lists what the author believes to be deficiencies in the guidance that the FDA needs to address in order to enable faster development of biosimilars.

Background

Section 351(k) of the PHS Act (42 USC 262[k]), added by the Biologics Price Competition and Innovation Act (BPCIA), sets forth the requirements for an application for a proposed biosimilar product or a proposed interchangeable product.

An application submitted under section 351(k) must contain, among other things, information demonstrating that the biological product is biosimilar to a reference product based upon data derived from:

Analytical studies that demonstrate that the biological product is highly similar to the reference product notwithstanding minor differences in clinically inactive components
Animal studies (including the assessment of toxicity)
A clinical study or studies, including the assessment of immunogenicity and pharmacokinetics (PK) or pharmacodynamics (PD) that are sufficient to demonstrate safety, purity, and potency in 1 or more appropriate conditions of use for which the reference product is licensed and intended to be used and for which licensure is sought for the biological product

The FDA has the discretion to determine that an element listed above is unnecessary in a 351(k) application. That discretion does not extend to the choice of the reference product, indications, dosage form, active drug form, or current good manufacturing practice compliance.

Analytical testing forms the core of the development of biosimilars, and the FDA had finalized guidance that described the statistical modeling of analytical similarity data; that guidance was universally followed by developers.

However, this guidance contained contradictions and inaccuracies pointed out by the author in the form of 2 citizen petitions, several publications, and testimonies to the FDA. The new draft guidance issued by the FDA replaces the earlier document, without making any reference to the earlier guidance, and includes several recommendations made by the author, as identified in this paper. Other recommendations that were not addressed by the FDA are also highlighted here, as are plausible options and solutions provided to help sponsors expedite the development of their biosimilar products.

In May 2019, the FDA also issued final guidance on the interchangeability status of biosimilars that included several recommendations made by the author. The FDA is expected to issue several additional guidance documents, as committed in its Biosimilars Action Plan, to make the development of biosimilars more rational without compromising their safety and efficacy.

Similarity Assessment

It is worthwhile to note that the FDA now uses the term “analytical assessment,” which has a much broader scope than “analytical similarity” (as used in the withdrawn guidance), and emphasizes that even in those situations in which a biosimilar product does not fully match elements of biosimilarity, the sponsor can make a point to demonstrate why these differences are not critical to the safety and efficacy of the biosimilar product. This is a major shift in the thinking of the FDA; even though this option was available before, now the FDA clearly encourages sponsors to take a more scientific approach.

If differences between products are observed as part of the comparative analytical assessment (including the components of the assessment that were not included in the risk ranking), the sponsor may provide additional scientific information (a risk assessment and additional data) and a justification for why these differences do not preclude a demonstration that the products are highly similar.

In certain situations, changes to the manufacturing process of the biosimilar product may be needed to resolve differences observed in the comparative analytical assessment. Data should be provided to demonstrate that the observed differences were resolved by any manufacturing changes and that other quality attributes were not substantially affected. If other attributes were affected by the manufacturing change, data should be provided to demonstrate that the impact of the change has been evaluated and addressed.

Reference Product Attributes

The first step in an analytical assessment is a determination of the quality attributes that characterize the reference product in terms of its structural/physicochemical and functional properties.

These quality attributes are then ranked according to their risk to potentially impact activity, PK/PD, safety, efficacy, and immunogenicity. Finally, the attributes are evaluated using quantitative analysis, considering the risk ranking of the quality attributes, as well as other factors. It should be noted, however, that some attributes may be highly critical (eg, primary sequence) but not amenable to quantitative analysis.

The FDA recommends that sponsors develop a risk assessment tool to evaluate and rank the reference product’s quality attributes in terms of the potential impact on the mechanism or mechanisms of action and the function of the product. Certain quality evaluations of the reference product (eg, its degradation rates, which are determined from stability or forced degradation studies) generally should not be included in the risk ranking. However, these evaluations should still factor in the comparative analytical assessment of the proposed biosimilar and the reference product.

Development of the risk assessment tool should be informed by relevant factors, including:

The potential impact of an attribute on clinical performance. Specifically, the FDA recommends that sponsors consider the potential impact of an attribute on activity, PK/PD, safety, efficacy, and immunogenicity. Sponsors should consider publicly available information, as well as the sponsor’s own characterization of the reference product, in determining the potential impact of an attribute on clinical performance.

The degree of uncertainty surrounding a certain quality attribute. For example, when there is a limited understanding of the relationship between the degree of change in an attribute and the resulting clinical impact, the FDA recommends that that attribute be ranked as having a higher risk because of the uncertainty raised.

The FDA recommends that an attribute that is high risk for any one of the performance categories (ie, activity, PK/PD, safety, efficacy, or immunogenicity) be classified as high risk. Ideally, the risk assessment tool should result in a list of attributes ordered by the risk to the patient. The risk scores for attributes should, therefore, be proportional to patient risk. The scoring criteria used in the risk assessment should be clearly defined and justified, and the risk ranking for each attribute should be justified with appropriate citations to the literature and data provided.
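The ranking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-level ordinal scale, the one-level bump for uncertainty, the default of "low" for unscored categories, and the attribute names and scores are all hypothetical and are not taken from the guidance.

```python
from dataclasses import dataclass

# Performance categories named in the guidance.
CATEGORIES = ("activity", "pk_pd", "safety", "efficacy", "immunogenicity")

# Hypothetical ordinal scale; the guidance does not prescribe one.
LEVELS = {"low": 1, "moderate": 2, "high": 3}


@dataclass
class Attribute:
    name: str
    scores: dict             # category -> "low" | "moderate" | "high"
    uncertain: bool = False  # limited understanding of clinical impact

    def overall_risk(self) -> int:
        # An attribute is as risky as its riskiest performance category.
        # Assumption for brevity: unscored categories count as "low";
        # a real tool would require an explicit, justified score for each.
        worst = max(LEVELS[self.scores.get(c, "low")] for c in CATEGORIES)
        # Assumption: uncertainty raises the rank one level, capped at "high".
        if self.uncertain:
            worst = min(worst + 1, LEVELS["high"])
        return worst


def rank_attributes(attributes):
    """Return attributes ordered from highest to lowest overall risk."""
    return sorted(attributes, key=lambda a: (-a.overall_risk(), a.name))


# Illustrative attributes and scores only.
example = [
    Attribute("charge variants", {"activity": "low", "pk_pd": "moderate"}),
    Attribute("afucosylation", {"activity": "high", "safety": "low"}),
    Attribute("oxidation", {"activity": "low"}, uncertain=True),
]
ranked = [(a.name, a.overall_risk()) for a in rank_attributes(example)]
```

Taking the maximum across categories, rather than an average, mirrors the recommendation that an attribute that is high risk in any one category be classified as high risk overall.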

Protocols for Testing

The FDA identifies 4 stages at which the sponsor is expected to submit analytical data for review by the FDA:

Early in the development process in a Biosimilars Initial Advisory meeting
At the pre-investigational new drug (IND) stage in a Type 2 Meeting
With the original IND submission
With the submission of data from the initial clinical studies, such as PK and PD studies, in a Type 2 Meeting

Manufacturing Changes

If there is a manufacturing process change during development, it may be possible, with an adequate scientific justification, to use data generated from lots manufactured with a different process. However, data should be provided in the 351(k) Biologics License Application (BLA) to support comparability of drug substance and drug product manufactured with the different processes and/or scales.

A sponsor considering manufacturing changes after completing the initial comparative analytical assessment or after completing clinical studies intended to support a 351(k) application will need to demonstrate comparability between the pre- and post-change proposed product, and may need to conduct additional studies. The nature and extent of the changes may determine the extent of these additional studies. The comparative analytical studies should include a sufficient number of lots of the proposed biosimilar product used in clinical studies as well as from the proposed commercial process if the process used to produce the material used in the clinical studies is different.

However, the option of using the International Council on Harmonisation of Technical Requirements for Pharmaceuticals for Human Use’s (ICH’s) comparability protocol remains with the sponsor to make changes post-licensing. The main question is whether the FDA will allow smaller lots to be approved as clinical lots.

Expression System

Minor modifications, such as N- or C-terminal truncations (eg, the heterogeneity of C-terminal lysine of a monoclonal antibody) that are not expected to change the product’s performance may be justified and should be explained by the sponsor. Possible differences between the chosen expression system (ie, host cell and the expression construct) of the proposed product and that of the reference product should be carefully considered because the type of expression system will affect the types of process- and product-related substances, impurities, and contaminants (including potential adventitious agents) that may be present in the protein product. For example, the expression system can have a significant effect on the types and extent of translational and posttranslational modifications that are imparted to the proposed product, which may introduce additional uncertainty into the demonstration that the proposed product is biosimilar to the reference product.

The FDA is gently reminding developers not to try out any novel expression systems and to stick to the expression system used by the reference product.

Impurities

If the manufacturing process used to produce the proposed product introduces different impurities or higher levels of impurities than those present in the reference product, additional pharmacological, toxicological, or other studies may be necessary. The FDA recommends removing impurity variations, rather than relying on proving the safety of impurities, in preclinical programs.

The process-related impurities in the proposed product are not expected to match those observed in the reference product and are not included in the comparative analytical assessment. The chosen analytical procedures should be adequate to detect, identify, and accurately quantify biologically significant levels of these impurities. In particular, results of immunological methods used to detect host cell proteins depend on the assay reagents and the cell substrate used.

Such assays should be validated using the product cell substrate and orthogonal methodologies to ensure accuracy and sensitivity. The safety of the proposed product with regard to adventitious agents or endogenous viral contamination should be ensured by screening critical raw materials and confirmation of robust virus removal and inactivation achieved by the manufacturing process.

Analytical Methodology

The FDA now recognizes that, despite improvements in analytical techniques, current analytical methodology is not able to detect or characterize all relevant structural and functional differences between 2 protein products; this forms the basis of what the FDA now calls “assessment,” which may include testing of multiple attributes with several methods when comparing a proposed biosimilar to the reference product. Comprehensive physicochemical and functional studies may include the following:

biological assays
binding assays and enzyme kinetics
the molecular weight of the protein
the complexity of the protein (higher order structure and post-translational modifications)
degree of heterogeneity
functional properties
impurity profiles and degradation profiles denoting stability

A full characterization of the reference product, in addition to consideration of publicly available information, will form the basis of understanding the observed lot-to-lot variability derived from manufacturing conditions and from analytical assay variability. Factors that contribute to lot-to-lot variability in the manufacturing of a protein product include the source of certain raw materials (eg, growth medium, resins, or separation materials) and different manufacturing sites.

Validation

The FDA has further made it clear that, unlike routine quality control assays, tests used to characterize the product do not necessarily need to be validated; however, the tests used to characterize the product should be scientifically sound, fit for their intended use, and able to provide results that are reproducible and reliable.

The methods should be demonstrated to be of appropriate sensitivity and specificity to provide meaningful information as to whether the proposed product and the reference product are highly similar. Validation is not required because the testing protocols call for the 2 products to be tested side by side, so that each serves as a control for the other. In release testing, there is no such control, so the methods must be validated.

Orthogonal Methods

The FDA encourages the development of orthogonal quantitative methods to definitively identify any differences in product attributes. Based on the results of analytical studies assessing functional and physicochemical characteristics, including, for example, higher-order structure, post-translational modifications, and impurity and degradation profiles, the sponsor may have an appropriate scientific basis for a selective and targeted approach to subsequent animal and/or clinical studies to support a demonstration of biosimilarity.

It is advisable to apply more than 1 analytical procedure to evaluate the same quality attribute. Methods that use different physicochemical or biological principles to assess the same attribute are especially valuable because they provide independent data to support the quality of that attribute (eg, orthogonal methods to assess aggregation).

In addition, the use of complementary analytical techniques in a series, such as peptide mapping or capillary electrophoresis combined with mass spectrometry of the separated molecules, should provide a meaningful and sensitive method for comparing products.

Fingerprint-Like Algorithm

It may be useful to compare differences in the quality attributes of the proposed product with those of the reference product using a meaningful, fingerprint-like analysis algorithm¹ that covers a large number of additional product attributes and their combinations with high sensitivity using orthogonal methods. Enhanced approaches in manufacturing science, as discussed in ICH Q8(R2), may facilitate production processes that can better match a reference product’s fingerprint.

Such a strategy could further quantify the overall similarity between 2 molecules and may lead to additional bases for a more selective and targeted approach to subsequent animal and/or clinical studies.

Bioassays

Multiple functional assays should, in general, be performed as part of the comparative analytical assessments. If a reference product exhibits multiple functional activities, sponsors should perform a set of appropriate assays designed to evaluate the range of relevant activities for that product.

For example, with proteins that possess multiple functional domains expressing enzymatic and receptor-mediated activities, sponsors should evaluate both activities to the extent that these activities are relevant to the product’s performance. For products for which functional activity can be measured by more than 1 parameter (eg, enzyme kinetics or interactions with blood clotting factors), the comparative characterization of each parameter between products should be assessed.

In vitro bioactivity assays may not fully reflect the clinical activity of the protein. For example, these assays generally do not predict the bioavailability (PK and biodistribution) of the product, which can affect PD and clinical performance. Also, bioavailability can be dramatically altered by subtle differences in glycoform distribution or other posttranslational modifications.

Thus, these limitations should be taken into account when assessing the robustness of the quality of data supporting biosimilarity and the need for additional information that may address residual uncertainties. Finally, functional assays are important in assessing the occurrence of neutralizing antibodies in nonclinical and clinical studies.

When binding is part of the activity attributed to the protein product, analytical tests should be performed to characterize the proposed product in terms of its specific binding properties (eg, if binding to a receptor is inherent to protein function, this property should be measured and used in comparative studies). Various methods, such as surface plasmon resonance, microcalorimetry, or classical Scatchard analysis can provide information on the kinetics and thermodynamics of binding. Such information can be related to the functional activity and characterization of the proposed product's higher-order structure.

Testing Lots

In the withdrawn guidance, the FDA had presented a complex 3-tier system, with tier 1 involving an equivalence interval, tier 2 involving equivalence range, and tier 3 involving physical matching of results.

The first 2 types of statistical modeling required selection of a minimum number of lots to satisfy the statistical criteria. The FDA continues to suggest that evaluation of multiple lots of a reference product and multiple lots of a proposed product is required to enable estimation of product variability across lots, and it continues to recommend using at least 10 reference product lots (acquired over a time frame that spans expiration dates of several years) in the analytical assessment to ensure that the variability of the reference product is captured adequately.

The final number of lots should be sufficient to provide adequate information regarding the variability of the reference product. In cases in which limited numbers of reference product lots are available (eg, for certain orphan drugs), alternate, flexible comparative analytical assessment plans should be proposed and discussed with the FDA.

Biosimilar Lots

The FDA recommends that a sponsor include at least 6 to 10 lots of the proposed product in the comparative analytical assessment, and these should include lots manufactured with the investigational- and commercial-scale processes.

They may include validation lots as well as product lots manufactured at different scales, including engineering lots. These lots should be representative of the intended commercial manufacturing process. To the extent possible, proposed biosimilar lots included in the comparative analytical assessment should be derived from different drug substance batches to adequately represent the variability of attributes inherent to the drug substance manufacturing process.

Extracted Drug Substance

If the drug substance has been extracted from the reference product to conduct analytical studies, the sponsor should describe the extraction procedure and provide support to show that the procedure itself does not alter relevant product quality attributes.

This undertaking would include consideration of an alteration or loss of the desired products and impurities and relevant product-related substances, and it should include appropriate controls to ensure that relevant characteristics of the protein are not significantly altered by the extraction procedure.

Lot Identification

Identification of specific lots of a reference product used in comparative analytical studies, together with expiration dates and time frames and when the lots were analyzed and used in other types of nonclinical and clinical studies, should be provided. This information will be useful in justifying acceptance criteria to ensure product consistency, as well as to support the comparative analytical assessment of the proposed product and the reference product.

Attribute Assessment

For all methods in which the result is reported relative to the reference standard, the assignment of the potency of 100% should include a narrow acceptable potency range and should ensure control over product drift.

For example, a sponsor should consider the use of a predetermined 2-sided confidence interval (CI) of the mean of the replicates, where the mean relative potency and the 95% CI are included within a sufficiently narrow range (eg, 90%-110%). There should be an evaluation across the history of multiple reference standard qualifications to address potential drift. A sponsor generally should not use a correction factor to account for any differences in, for example, potency or biological activity between reference standards.
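As a rough illustration of the confidence-interval check described above, the following Python sketch computes a 2-sided 95% CI of the mean relative potency from replicate results and tests whether the whole interval lies within 90% to 110%. The replicate values are hypothetical, the function names are invented for this example, and a Student t interval is assumed; the guidance does not specify how the interval should be computed.

```python
import math
import statistics

# Two-sided 95% Student t critical values, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def potency_ci(replicates):
    """Return (mean, lower, upper) for the 95% CI of the mean relative potency (%)."""
    n = len(replicates)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError("this sketch supports 2 to 11 replicates")
    mean = statistics.mean(replicates)
    sem = statistics.stdev(replicates) / math.sqrt(n)  # standard error of the mean
    half_width = T_CRIT_95[df] * sem
    return mean, mean - half_width, mean + half_width


def qualifies(replicates, low=90.0, high=110.0):
    """True if the entire 95% CI sits inside the acceptance range."""
    _, lower, upper = potency_ci(replicates)
    return low <= lower and upper <= high


# Hypothetical replicate results (% relative potency).
tight = [98.0, 101.0, 100.0, 102.0, 99.0, 100.0]
noisy = [85.0, 115.0, 92.0, 110.0, 100.0, 98.0]

tight_ok = qualifies(tight)   # CI is roughly 98.5 to 101.5
noisy_ok = qualifies(noisy)   # CI is roughly 88.3 to 111.7
```

Both data sets have a mean of exactly 100%, yet only the first qualifies. Requiring the interval, and not just the mean, to fall inside the range is what penalizes imprecise assignments.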

The Totality of Analytical Data

Acceptance criteria should be based on the totality of the analytical data and not simply on the observed range of product attributes of the reference product. This is because some product attributes act in combination to affect a product’s safety, purity, and potency profile; therefore, their potential interaction should be considered when conducting the comparative analytical assessment and setting specifications.

For example, for some glycoproteins, the content and distribution of tetra-antennary and N-acetyl lactosamine repeats can affect in vivo potency, and they should be evaluated together. The FDA emphasizes confirming the relationship between an attribute and the performance of the drug product (see ICH Q8[R2]) to help establish acceptance criteria.

Accountability

Sponsors should account for all reference product lots acquired and characterized. The 351(k) BLA should include data and information from all reference product and proposed product lots that were evaluated in any manner, including the specific physicochemical, functional, animal, and clinical studies for which a lot was used.

When a lot is specifically selected for inclusion or exclusion from certain analytical studies, a justification should be provided. The date of the analytical testing, as well as the product expiration date, should be provided in the application.

In general, expired reference product lots should not be included in the comparative analytical assessment because lots analyzed beyond their expiration date could lead to results outside the range that would normally be observed in unexpired lots, which may result in overestimated reference product variability. Testing of lots past expiry may be acceptable if samples are stored under long-term conditions (eg, frozen at −80°C), provided that sponsors submit data and information demonstrating that storage does not impact the quality of the product.
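A sponsor tracking lots electronically might apply the expiry rule along the following lines. This Python sketch is illustrative only: the record layout, lot identifiers, and dates are hypothetical, and the storage_justified flag simply stands in for the supporting storage data that the guidance asks sponsors to submit.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LotTest:
    lot_id: str
    expiry: date
    tested_on: date
    # True only if long-term storage (eg, frozen) is backed by data showing
    # no impact on quality; documenting that is outside this sketch.
    storage_justified: bool = False


def include_in_assessment(t: LotTest) -> bool:
    """Include lots tested on or before expiry, or past expiry with justification."""
    return t.tested_on <= t.expiry or t.storage_justified


# Hypothetical records.
records = [
    LotTest("REF-001", date(2019, 6, 30), date(2019, 3, 1)),
    LotTest("REF-002", date(2018, 12, 31), date(2019, 3, 1)),
    LotTest("REF-003", date(2018, 12, 31), date(2019, 3, 1), storage_justified=True),
]
included = [t.lot_id for t in records if include_in_assessment(t)]
excluded = [t.lot_id for t in records if not include_in_assessment(t)]
```

Excluded lots are kept in a separate list and not discarded, because every lot acquired must still be accounted for and each exclusion justified in the application.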

The same type of information and data described above to be collected for reference product lots should also be provided on every manufactured drug substance and drug product lot of the proposed product.

Reference product and proposed product lots used in the clinical studies (eg, applicable PK and PD, similarity studies, and comparative clinical studies) should be included in the comparative analytical assessment.

Reference Standard

If there is a suitable, publicly available, and well-established reference standard for the protein, a physicochemical and/or functional comparison of the proposed product with this standard may also provide useful information. However, while studies with such a reference standard may be useful, they are not sufficient to satisfy the BPCIA’s requirement to demonstrate the biosimilarity of the proposed product to the US-licensed reference product.

Once clinical lots of the proposed product have been manufactured, it is expected that 1 of these lots will be properly qualified (including bridging to previous reference standards) for use as a reference standard for release and stability, as well as comparative analytical testing. If possible, once an in-house reference standard is properly qualified, there should be sufficient quantities to use throughout the development of the proposed product. All lots of reference standards used during the development of a proposed product should be properly qualified. In addition to release testing methods, the qualification protocol for reference standards should include all analytical methods that report the result relative to the reference standard.

Non–US-Licensed Comparator Products

A sponsor intending to use a non–US-licensed comparator in certain studies should provide comparative analytical data and analysis for all pairwise comparisons (ie, US-licensed product versus proposed biosimilar product, non–US-licensed comparator product versus proposed biosimilar product, and US-licensed product versus non–US-licensed comparator product).

Combining data from the reference product and the non–US-licensed comparator product to determine acceptance criteria or to perform the comparative analytical assessment to the proposed product would not be acceptable to support a demonstration of the proposed product’s biosimilarity to the reference product.

For example, combining data from the reference product and non–US-licensed products may result in a larger range and broader similarity acceptance criteria than would be obtained by relying solely on data from reference product lots.
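The widening effect is easy to see with a small numerical sketch. The Python below assumes acceptance limits of the form mean ± X standard deviations with X = 3; the lot values are hypothetical, and the small offset between the two sources is invented purely for illustration.

```python
import statistics


def mean_sd_range(values, x=3.0):
    """Return (low, high) as mean ± x * sample standard deviation."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)
    return m - x * sd, m + x * sd


# Hypothetical per-lot results for one attribute (eg, % of a glycoform).
us_reference = [10.0, 10.4, 9.8, 10.2, 9.6, 10.0]
non_us_comparator = [11.0, 11.4, 10.8, 11.2, 10.6, 11.0]

ref_low, ref_high = mean_sd_range(us_reference)
pooled_low, pooled_high = mean_sd_range(us_reference + non_us_comparator)

ref_width = ref_high - ref_low            # about 1.7
pooled_width = pooled_high - pooled_low   # about 3.5
```

Each source on its own has the same spread, but the difference between their means inflates the pooled standard deviation, so the pooled range is roughly twice as wide and would admit proposed-product lots that the reference-only range rejects.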

Drug Product Lots

Characterization studies of a proposed product should be performed on the most downstream intermediate best suited for the analytical procedures used. Whenever possible, if the finished drug product is best suited for a particular analysis, the sponsors should analyze the finished drug product. If an analytical method more sensitively detects specific attributes in the drug substance, but the attributes it measures are critical and/or may change during the manufacture of the finished drug product, comparative characterization may be called for on both the extracted protein and the finished drug product.

The BPCIA allows the use of different inactive ingredients; however, different excipients in the proposed product should be supported by existing toxicology data for the excipient or by additional toxicity studies with the formulation of the proposed product. Excipient interactions, as well as direct toxicities, should be considered.

Data Analysis

The new guidance removes the tier 1 testing and keeps the tier 2 and tier 3 testing, though without labeling them as such. One approach to data analysis is the use of descriptive quality ranges for assessing quantitative quality attributes of high and moderate risk, and the use of raw data/graphical comparisons for quality attributes with the lowest risk ranking or for those quality attributes that cannot be quantitatively measured (eg, primary sequence).

The acceptance criteria for the quality ranges (QR) method in the comparative analytical assessment should be based on the results of the sponsor’s own analysis of the reference product for a specific quality attribute. The QR should be defined as a range calculated by adding to and subtracting from the reference mean the value of standard deviation of the reference product multiplied by a factor, X.

The multiplier (X) should be scientifically justified for that attribute and discussed with the FDA.

Based on experience to date, methods such as tolerance intervals are not recommended for establishing the similarity acceptance criteria because a very large number of lots would be required in order to establish meaningful intervals. The sponsor can propose other methods of data analysis, including equivalence testing.

The objective of the comparative analytical assessment is to verify that each attribute, as observed in the proposed biosimilar and the reference product, has a similar population mean and similar population standard deviation. Comparative analysis of a quality attribute would generally support a finding that the proposed product is highly similar to the reference product when a sufficient percentage of biosimilar lot values (eg, 90%) fall within the QR defined for that attribute (previously labeled as tier 2).

The FDA recommends that narrower acceptance criteria of the QR method in the comparative analytical assessment (eg, a lower X value) be applied to higher-risk quality attributes.
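The quality range calculation described above can be sketched as follows. This Python example is a minimal illustration: the lot values are hypothetical, the X values of 3 and 2 are chosen only to show the effect of tightening the range, and the 90% threshold is the example figure given in the text and is not a fixed requirement.

```python
import statistics


def quality_range(reference_values, x):
    """Return (low, high) = reference mean ± x * reference sample SD."""
    m = statistics.mean(reference_values)
    sd = statistics.stdev(reference_values)
    return m - x * sd, m + x * sd


def fraction_within(biosimilar_values, low, high):
    """Fraction of biosimilar lot values that fall inside [low, high]."""
    inside = sum(1 for v in biosimilar_values if low <= v <= high)
    return inside / len(biosimilar_values)


def supports_similarity(reference_values, biosimilar_values, x, required=0.90):
    """True if at least `required` of the biosimilar lots fall within the QR."""
    low, high = quality_range(reference_values, x)
    return fraction_within(biosimilar_values, low, high) >= required


# Hypothetical data: 10 reference lots and 8 proposed-product lots.
reference = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, 100.0, 101.0, 99.0]
biosimilar = [100.5, 99.0, 102.5, 97.5, 101.0, 104.0, 98.5, 100.0]

passes_x3 = supports_similarity(reference, biosimilar, x=3.0)  # all 8 lots inside
passes_x2 = supports_similarity(reference, biosimilar, x=2.0)  # 7 of 8 lots inside
```

With X = 3 every proposed-product lot falls inside the range, whereas with X = 2 one lot falls outside and the 90% criterion is missed. The same data can therefore pass or fail depending on the multiplier, which is why the choice of X needs to be justified per attribute.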

In addition to risk ranking, other factors should be considered in determining which type of quantitative data analysis should be applied to a particular attribute or assay. Some additional factors that should be considered when determining the appropriate type of data evaluation and analysis of results include the nature, distribution, and abundance of the attribute, as well as the sensitivity and type of the assay.

Qualitative analyses of lower-risk attributes will include a side-by-side data presentation (eg, spectra, thermograms, and graphical representation of data) to allow for a visual comparison of the proposed product to the reference product (previously labeled as tier 3).

Summary

The new analytical assessment guidance, intended to replace the withdrawn guidance, provides clarification of several practices, yet leaves out many specifics for the sponsors to interpret.

Given below are the key findings, and a listing of how the FDA’s guidance aligns with my citizen petitions, publications, and testimonies that included recommendations and inquiries to the FDA.

The guidance clarifies the conditions under which a non-US comparator can be used. In the past, it was generally assumed that a PK/PD study would be required; now the decision is based entirely on analytical similarity, a logical and scientific approach. This recommendation was included in my citizen petition.

The FDA provided clarification that testing methods for analytical assessment need not be validated where the testing is conducted in a pair-wise manner; it does not apply to release testing. This recommendation was included in my citizen petition.

The lots tested should represent commercial scale, and any changes made during development will require analytical testing, at the least. The FDA should have defined how a “commercial scale” is determined, and why the biosimilar sponsor would not be allowed to scale up by using the comparability protocol after approval. This clarification would have allowed many smaller companies to enter the biosimilar market and reduce the cost of development, all without risking the safety and efficacy of the product. This recommendation was included in my citizen petition, but it was not clarified.

The tier system of statistical testing has been removed, leaving only the physical comparison of data and the use of an equivalence range based on the attribute variability in the reference product; how much variation is allowed is now left to sponsors to justify. The FDA continues to emphasize the importance of keeping the range as small as possible, so I do not see sponsors using any factor multiplier larger than 3, as before. There is still a large possibility that a biosimilar product may fail a test if the reference product has extremely narrow variability; this failure will be of little value if the attribute can vary within a reasonable range without affecting safety or efficacy. My major objection to the earlier FDA guidance related to the use of an “equivalence interval,” an exercise that requires establishing an acceptable difference without any proven basis. This recommendation was included in my citizen petition.
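
To make the arithmetic behind such a range concrete, the following is a minimal Python sketch. It assumes the range takes the form mean ± k × SD of the reference lots, with k as the sponsor-justified multiplier. The function names and all lot values are hypothetical and are not taken from the guidance.

```python
from statistics import mean, stdev


def quality_range(reference_values, k=3.0):
    """Return (low, high) as mean +/- k * sample SD of the reference lots.

    Assumption: this mean +/- k*SD form and the default k=3 are illustrative.
    """
    m = mean(reference_values)
    s = stdev(reference_values)  # sample standard deviation (n - 1)
    return m - k * s, m + k * s


def fraction_within(values, low, high):
    """Fraction of lots whose attribute value lies inside [low, high]."""
    inside = [low <= v <= high for v in values]
    return sum(inside) / len(inside)


# Hypothetical attribute values (e.g., relative potency, %) for 10 reference
# lots and 6 biosimilar lots; these numbers are invented for illustration.
reference_lots = [98.0, 99.0, 100.0, 101.0, 102.0, 100.0, 99.5, 100.5, 101.5, 98.5]
biosimilar_lots = [99.2, 100.8, 101.1, 98.9, 100.3, 104.9]

low, high = quality_range(reference_lots, k=3.0)
share = fraction_within(biosimilar_lots, low, high)
print(f"range: {low:.2f} to {high:.2f}; share of biosimilar lots inside: {share:.2f}")
```

With these illustrative numbers, five of the six biosimilar lots fall inside the range. Shrinking the spread of the reference lots narrows the range and makes such a failure more likely, whether or not the attribute matters clinically.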

Selection of analytical assessment attributes is left to the sponsors, with guidance on selecting the attributes; the FDA should have stated that the release attributes need not be made part of the analytical assessment, on the basis that if it is good for patients then it is an appropriate choice. Much analytical assessment testing can be eliminated if a suitable variability is established based on common wisdom, industry practice, and an understanding of the limitations of manufacturing operations. The exclusions to comparative testing may include protein content, bioassay, visible and subvisible particles, total identified impurities, common physical attributes, and other attributes for which a justified range of variability is established. This recommendation was included in my citizen petition, but it was not clarified.

The FDA continues to require that the developer use at least 10 lots of the reference product and 6-10 lots of the biosimilar product to conduct analytical similarity testing; the reference lots selected should also represent the entire expiry interval, which makes it very difficult to secure lots of this type and number, quite apart from the cost of acquisition. I had suggested that the FDA divide the attributes into 2 groups. The first would characterize the molecular structure, an exercise that can be conducted on the drug substance without needing a large number of test lots, since these attributes are not expected to vary much and the biosimilar candidate must match all. The second group would relate to attributes that are process-dependent, including post-translational modifications, and capable of affecting safety and potency. These attributes should be made part of the release specification, to assure safety and potency. Testing for structural and functional attributes is best conducted in a pair-wise manner with a reference product that serves as a reference standard, and multiple lots are required because the testing methods need not be validated. None should require equivalence testing. While structural elements can affect safety and efficacy, the functional assays relate mainly to efficacy. This testing should not require more than 3 lots.

Establishing release specifications should be based on scientific wisdom correlating safety and potency to various attributes. Acceptance ranges can be assigned to protein content (±3%), bioassay (±15%), visible and subvisible particles (United States Pharmacopeia limits), common physical attributes (±x%, variable for each attribute; usually ±10% but as high as ±50% for surfactants), aggregates (not more than [NMT] ±10% of reference), post-translational modifications (±10% of reference for each, and none unidentified), and impurities (NMT 3%, no single more than 1%, none unidentified). Assigning these ranges is no different from equivalence testing. The sponsor may suggest alternate ranges, with justification, that may involve testing multiple lots of reference product. This recommendation was included in my citizen petition, but it was not clarified.
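
As a rough illustration of how fixed acceptance ranges of this kind could be applied at release, here is a minimal Python sketch. It assumes that each ± percentage is taken relative to a target value such as the label claim. The function names and lot values are hypothetical.

```python
def within_percent(value, target, tolerance_pct):
    """True if value lies within +/- tolerance_pct percent of target.

    Assumption: the tolerance is relative to the target (e.g., label claim).
    """
    return abs(value - target) <= abs(target) * tolerance_pct / 100.0


def impurities_ok(impurities_pct, total_limit=3.0, single_limit=1.0):
    """True if total impurities are NMT total_limit and no single one exceeds single_limit."""
    total = sum(impurities_pct)
    largest = max(impurities_pct, default=0.0)
    return total <= total_limit and largest <= single_limit


# Hypothetical release results for one lot (invented numbers).
lot = {
    "protein_content": 101.5,       # % of target
    "bioassay": 92.0,               # % relative potency
    "impurities": [0.8, 0.9, 0.7],  # % for each identified impurity
}

results = {
    "protein_content": within_percent(lot["protein_content"], 100.0, 3.0),
    "bioassay": within_percent(lot["bioassay"], 100.0, 15.0),
    "impurities": impurities_ok(lot["impurities"]),
}
print(results)
```

A sponsor proposing alternate ranges would change only the tolerance arguments; the justification for those values remains a scientific question that the code does not address.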

The FDA admits that despite the advancements in analytical technology, testing methods may still fall short in defining differences between products, and it recommends using multiple and orthogonal methods to test critical attributes. The FDA did not provide any clarification on acceptance criteria for cases in which a product attribute is tested through multiple tests. That is, how would it be determined whether one test is more appropriate than the other? Would a sponsor be motivated to use only the test methods that are less likely to fail? The FDA should establish testing templates based on product types and require an orthogonal test only if there is a reason to prefer one test method over another. This recommendation was included in my citizen petition, but it was not clarified.

The FDA also admits that in vitro bioassays, functional or binding, do not always predict in vivo performance, including PK/PD, immunogenicity, and efficacy. The value of in vitro testing can be established on the basis that the 2 products are tested in a pair-wise manner and the purpose of testing is not to predict any in vivo performance but to compare the potential for in vivo performance that can be fulfilled from the in vitro methods. The value of in vitro methods comes from the ease with which they can be conducted, allowing testing at multiple stages of development. The FDA should encourage development and use of more in vitro testing. This recommendation was included in my citizen petition, but it was not clarified.

The FDA has suggested that a biosimilar product need not always match the attributes of the reference product; the sponsors may provide data to justify any differences, based on scientific arguments and, where applicable, through additional studies. The FDA has also pointed out that manufacturing process-related impurities can be different, and these should be justified by testing their safety, but preferably by modifying the manufacturing process to change the profile. The FDA should have indicated which process-related attributes can add to immunogenicity, and it should provide suggestions on minimizing the introduction of known variables. Immunogenicity is best controlled through release specification since testing for immunogenicity in 1 lot does not assure the continued safety of the product. This recommendation was included in my citizen petition, but it was not fully clarified.

The FDA has laid out the testing that may be required if the sponsor plans to use an expression system other than the system used by the reference product, advising against this choice. This recommendation was included in my citizen petition.

The FDA did not provide any clarification on the selection of the reference product should that product re-enter intellectual property protection due to changes in concentration or in the route of administration (a common practice now) as the patents expire for biological drugs. It is noteworthy that the BPCIA requires the biosimilar product to have the same route of administration. This recommendation was included in my citizen petition, but it was not clarified.

The FDA will be issuing several new guidance documents under the Biosimilars Action Plan with the purpose of simplifying the development of biosimilars; the new guidance on analytical assessment leaves many considerations open-ended, and sponsors are encouraged to provide their views to the FDA as comments on this guidance.

Reference

1. Kozlowski S, Woodcock J, Midthun K, Sherman RB. Developing the nation's biosimilars program. N Engl J Med. 2011;365:385-388. doi: 10.1056/NEJMp1107285.