4. Prospective-Retrospective Approaches
A prospective-retrospective study with respect to an IVD companion diagnostic is one in
which there is a pre-specified plan to prospectively collect specimens and retrospectively
analyze outcomes based on the IVD result (which result may be obtained at the time of
specimen collection or at a later point) after the clinical trial is completed. The statistical
analysis plan should pre-specify a marker-based study objective that identifies the
samples that will be collected, the testing that will be conducted based on the samples
collected, and how outcomes will be analyzed based on the IVD results.

By definition, in a prospective-retrospective study, the random assignment of subjects to
treatment arms cannot have been stratified by marker status. However, subjects within
the marker-based subpopulation were randomly assigned to treatment arms, preserving
the validity of treatment comparisons within that marker-based subpopulation.

Therapeutic product indications are usually based on prospective clinical trials.
Therapeutic product claims based on prospective-retrospective studies will generally be
accepted only in defined circumstances, and will likely need to be substantiated in more
than one adequate, well-controlled study. A prospectively-defined retrospective analysis
might be considered acceptable if the following recommendations are followed:[67]
· Pre-specification of the primary analysis endpoint(s) occurs prior to study unblinding or any unblinded interim analysis.
· The banked samples are from an adequate, well-conducted, well-controlled study.
· The study is of adequate size such that treatment effects in one or more marker-defined subgroups of interest can be determined.
· The test result can be ascertained in a very large proportion of the study subjects.
· The IVD has acceptable analytical performance.
· The pre-specified retrospective analysis plan is considered acceptable by FDA.
· Users of the assay are blinded to the study’s clinical outcomes.

To use a prospective-retrospective design, knowledge of the prevalence of the marker of
interest in the population to be treated is critical to enable a valid analysis, both to assure
that enough marker-positive subjects will be enrolled and to assure sufficient
randomization of marker-positive and -negative subjects to the various treatment arms.

[67] For further discussion, see transcripts from the December 16, 2008, meeting of FDA’s Oncologic Drugs
Advisory Committee discussing KRAS testing (http://www.fda.gov/ohrms/dockets/ac/cder08.html).

Contains Nonbinding Recommendations
Draft - Not for Implementation

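As a rough illustration of why marker prevalence matters for enrollment planning, the following Python sketch computes the probability that a trial of a given size enrolls at least a required number of marker-positive subjects. It treats each subject's marker status as an independent draw at an assumed prevalence. The binomial model, the function names, and all numbers are illustrative assumptions and are not taken from this guidance.

```python
from math import comb


def prob_at_least(n_enrolled: int, prevalence: float, n_required: int) -> float:
    """Return P(X >= n_required) for X ~ Binomial(n_enrolled, prevalence).

    Assumption: marker status is independent across subjects.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    # Probability of enrolling fewer than n_required marker-positive subjects.
    below = sum(
        comb(n_enrolled, k) * prevalence**k * (1.0 - prevalence) ** (n_enrolled - k)
        for k in range(n_required)
    )
    return 1.0 - below


def min_enrollment(prevalence: float, n_required: int,
                   assurance: float = 0.9, max_n: int = 10_000) -> int:
    """Smallest enrollment giving at least `assurance` probability of
    reaching n_required marker-positive subjects (hypothetical helper)."""
    if prevalence <= 0.0:
        raise ValueError("prevalence must be positive")
    for n in range(n_required, max_n + 1):
        if prob_at_least(n, prevalence, n_required) >= assurance:
            return n
    raise ValueError("no enrollment size up to max_n meets the assurance level")
```

Under these assumptions, `min_enrollment(0.5, 1, 0.9)` returns 4. A real program would also need to account for treatment-arm allocation and for subjects whose test result cannot be obtained.
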
The statistical analysis plan should include a plan to address robustness (sensitivity) of
study conclusions to missing test results. Subjects with and without test results should be
compared on the distribution of variables that could affect the assay result, especially
variables concerning the characteristics of the sample, its handling, and its processing.
Subjects with and without test results may also need to be compared on the distribution of
individual characteristics, disease characteristics, and outcome. The impact of missing
data on clinical performance (e.g., hazard ratio in marker-defined subset) should be
analyzed. To evaluate the sensitivity of clinical performance to missing data, a model
may be used to impute missing test results based on the variables described above.
Analyses should consider that data may be missing not at random but may
disproportionately include subjects with assay results near the cutoff, for example.
Analysis based on an incomplete sample of marker data may yield biased results.

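One simple way to compare subjects with and without test results on a continuous variable is a standardized mean difference. The Python sketch below is a minimal illustration. The choice of statistic, the function name, and the example covariate (tumor volume) are assumptions made for illustration. The guidance does not prescribe a particular method.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(with_result, without_result):
    """Difference in means between subjects with and without a test result,
    scaled by the pooled standard deviation of the two groups.

    Each argument is a list of covariate values (at least two per group).
    """
    pooled_sd = sqrt((variance(with_result) + variance(without_result)) / 2)
    if pooled_sd == 0:
        return 0.0  # no variation in either group
    return (mean(with_result) - mean(without_result)) / pooled_sd


# Hypothetical tumor volumes (arbitrary units) for illustration only.
volumes_with_result = [4.1, 5.3, 6.0, 5.8, 4.9]
volumes_without_result = [2.2, 3.1, 2.9, 3.6, 2.7]
smd = standardized_mean_difference(volumes_with_result, volumes_without_result)
```

A large absolute value would suggest that subjects with missing results differ systematically on that variable, which is the situation in which an analysis restricted to subjects with results is most likely to be biased.
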
For trials in which subject samples are taken prior to treatment assignment, the
probability of having a test result for a subject is independent of treatment assignment.
However, for various reasons the distribution of available test results on archived samples
may be distorted relative to the distribution in fresh samples (e.g., tumors with larger
volume may be overrepresented), which may limit the generalizability of treatment
effects observed in retrospective studies of archived samples.

5. Considerations for Identifying Intended Populations
In codevelopment programs, the goal is usually to identify a population expected to benefit
from the therapeutic product (or a particular dose) or to avoid serious toxicities caused by the
therapeutic product. Therefore, sponsors should pay close attention to the range of analytes
and to establishing appropriate assay cutoffs to adequately define this population.

i. Adequate Representation of Markers in Study Population
Selection of appropriate study populations, or of doses, dosing intervals, etc., of the therapeutic
product in codevelopment programs may rely on results from an IVD that detects or
measures a single marker or detects or measures multiple genetic variants or other markers.[68]

In general, sample size depends on the primary outcome of interest, the magnitude of the
treatment effect in the population to be analyzed, and the prevalence of the marker in the
population to be analyzed. When designing a clinical trial, the most straightforward option is
to ensure adequate representation of each marker of potential importance to enable
characterization of the efficacy and/or safety across all of the markers within a population.
The prevalence of the markers may differ substantially relative to one another, such that it
may not always be appropriate to enroll all subjects with a given marker. To assure
enrollment of an adequate number of subjects with a low-prevalence marker of interest, a
pre-specified enrichment strategy is appropriate. When determining the appropriate study
population and breadth of marker capture, sponsors may consult with the lead therapeutic
product review center for feedback on whether and to what extent marker-negative and
rarer-marker subjects should be included. It is also important to include, where applicable,
subjects with a range of positivity on the marker to assess the relation of the degree of
marker-positivity to outcome and to establish a marker cutoff. If there is insufficient
evidence to support the use of certain markers detected by the IVD, the therapeutic product
review center will determine whether or how such markers should be included in the
therapeutic product labeling. Sponsors should be aware that, regardless of each marker’s
prevalence, analytical validation of the IVD for each reported marker may be necessary (see
Section III.C.7.).

[68] Note that multiple markers that are combined to generate a single composite result are generally treated as a
single marker, and thus prevalence of individual markers would not be a concern.

ii. Establishing Cutoffs for IVD Companion Diagnostics
The cutoff for an IVD companion diagnostic is the test value above (or below) which the
clinical decision changes (for example, subjects with test results above the cutoff value are
eligible for treatment, whereas those with test results below the cutoff value are not given the
treatment). Pre-specified cutoff values are essential for the analysis of use of the IVD in a
clinical trial. These may be chosen based on prior data, but validating the cutoff is often an
important objective of the clinical trial. The cutoff value is intended to represent a point
where the sponsor can reliably identify the subjects who are suitable for randomization,
choose the appropriate dose, or make other clinical trial decisions. Although the analysis will
often be based on the population above the cutoff, results from subjects below the cutoff will
also be of interest (e.g., assessment of the appropriateness of the cutoff).
An IVD companion diagnostic’s cutoff value should represent a point above (or below)
which patients are considered to be positive or negative for the marker(s) of interest. Cutoff
values that distinguish relevant trial populations usually should be established for the
investigational IVD prior to use in clinical trials intended to be submitted to support a
therapeutic product’s approval.[69]

To date, most IVD companion diagnostics have yielded a qualitative result that classifies
subjects into two or more groups (e.g., mutation present or absent). Qualitative results often
have an underlying quantitative variable that is important for establishing the cutoff between
the qualitative classifications. This cutoff may be the limit of detection, the limit of
quantitation, or a value that corresponds to a clinically-significant decision point.

When a test result is quantitative (i.e., yields a continuum of values), consideration should be
given to whether additional studies evaluating the dose-response relationship between the
marker of interest and the therapeutic product are necessary to refine the cutoff to include a
range of marker-positive subjects in the clinical trial, either as distinct randomized groups or
as subsets that can be analyzed later, perhaps leading to a formal baseline-response study. If
the marker is both prognostic and predictive, it may also be necessary to stratify subjects to
treatment arms based on a pre-specified cutoff value.

[69] See note 51.

For ordinal values (e.g., immunohistochemistry (IHC) tests scored as 0, 1+, 2+, 3+),
pre-specification of categories considered above and below the cutoff is strongly recommended.
Although the statistical plan will include a cutoff (e.g., ≥ 2+), results in all categories will be
informative.

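The ordinal case can be sketched in Python as follows: the cutoff is fixed in advance, each category is classified against it, and counts are still reported for every category rather than only for the dichotomized result. The category labels follow the IHC example in the text. The function names and the default cutoff of 2+ are illustrative assumptions.

```python
from collections import Counter

# Ordered IHC categories, lowest to highest, as in the example in the text.
IHC_ORDER = ["0", "1+", "2+", "3+"]


def is_positive(score: str, cutoff: str = "2+") -> bool:
    """True if the score is at or above the pre-specified cutoff category.

    Raises ValueError for a score or cutoff that is not a known category.
    """
    return IHC_ORDER.index(score) >= IHC_ORDER.index(cutoff)


def tabulate(scores, cutoff: str = "2+"):
    """Return (category, count, classification) for every category,
    including categories below the cutoff and categories with no subjects."""
    counts = Counter(scores)
    return [
        (cat, counts.get(cat, 0), "positive" if is_positive(cat, cutoff) else "negative")
        for cat in IHC_ORDER
    ]
```

Keeping the per-category counts alongside the classification makes it possible to examine outcomes within each category later, without altering the pre-specified cutoff.
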
If indeterminate (or equivocal) values will be produced, the sponsor should discuss how
subjects with such values will be classified for purposes of the clinical trial, and how the
indeterminate zone will be used clinically if the therapeutic product and its IVD companion
diagnostic receive marketing authorization.[70] The sponsor should also consider other data
that would be needed to classify such patients. In light of these complexities, IVD
companion diagnostics that provide clear cutoff values are strongly recommended, where
available.

For IVD companion diagnostics, the validity of the test is determined by the ability of the test
result to support conclusions made about the treated group when the specified cutoff is used.
As with any IVD, changing the cutoff(s) can change the way patients are classified (e.g.,
marker-negative or marker-positive). Therefore, it is very important that the cutoff be
specified prior to using the test in a clinical trial. In most cases, inclusion of some subjects
below the cutoff can be useful to refine the cutoff (e.g., when subjects with values below the
cutoff have some likelihood of achieving the treatment effect of the therapeutic product),
even if the primary analysis includes only subjects above the cutoff. It is recognized that the
optimal cutoff may be unknown before clinical data are available in a reasonable number of
subjects. In such cases, another clinical trial confirming the results with the new cutoff, or an
adaptive design that allows intra-trial cutoff alterations, would be necessary to ensure that
positive results are not due to bias or chance.

E. Considerations for IVD Development in Late Therapeutic Product Development
For the majority of IVD companion diagnostics for novel therapeutic products, FDA
expects that clinical evidence to support use of the IVD companion diagnostic will be
generated in the major efficacy trial(s) intended to support approval of the therapeutic
product. Therefore, it is important that the investigational IVD(s) used in these trials is
completely specified and that analytical validation is complete and meets the therapeutic
product sponsor’s expectations for performance.[71] To assure that the analytical validation
is well-established and that the IVD can be relied on to supply the correct results, the
elements discussed in the following sections should be considered for relevance to the
investigational IVD, and applicable elements should be addressed appropriately in the
validation study design.

[70] An example of use of an indeterminate cutoff is the 2+ result of the IHC tests for HER-2 overexpression.
Reproducibility studies revealed that readers had a difficult time separating 2+ from 1+ and 3+ results. The
clinical trial confirmed that fewer persons with 2+ results were having positive treatment outcomes than persons
with clear 3+ results, and, as a result, 2+ results were re-categorized as representing indeterminate rather than
positive results. To address the uncertainty of values in this gray zone, a recommendation in the clinical
practice was introduced to have all 2+ results evaluated by re-assay with another type of test. (See Herceptin
(trastuzumab) package insert, available at:
http://www.accessdata.fda.gov/drugsatfda_docs/label/2000/trasgen020900LB.htm).
[71] Note that there may be some circumstances where an alternative approach may be appropriate, such as
prospective adaptive designs or prospective-retrospective trials.

1. Training Sample Sets versus Validation Sample Sets
The set of clinical samples used to design an IVD and establish the clinical decision
point(s) and assay cutoff(s) is referred to as the “training set.” Testing should be
conducted with a second set of independent clinical samples (i.e., the “validation set”)
and with the final IVD design to validate the IVD and determine whether the assay
cutoffs correlate with clinical outcome. For IVD companion diagnostics, the validation
sample set is generally made up of samples from subjects screened for enrollment into the
major efficacy clinical trial(s) that is intended to support efficacy claims for the
therapeutic product. For this reason, IVD design and assay cutoffs should be established
before the IVD is applied to these samples.

If changes are made to the IVD based on results obtained with the clinical samples from
the major efficacy trial(s) (e.g., changing the cutoff to include all those who responded in
the trial), then what would otherwise have been the validation set effectively becomes a
new training set for the modified IVD. The modified IVD likely could not receive
marketing authorization as an IVD companion diagnostic without further studies, as it
will likely not select the same population represented in the major efficacy trial(s). For
this reason, the analytical development of the new IVD should not be conducted with the
specimens needed to clinically validate the assay. While it may seem logical to use the
trial specimens to assure concordance between the two versions of the test, there is no
assurance as to whether the same concordance would be obtained with a different set of
samples. The new IVD design may be established with a set of procured clinical samples
similar to the subjects in the trial or samples from earlier investigational trials.

2. Effect of Changes to the Test Design
In codevelopment programs, the target population for a therapeutic product is selected on
the basis of test results. It is important to ensure that this same population can be
identified after approval of the therapeutic product. When the use of an IVD companion
diagnostic is essential for the safe and effective use of the therapeutic product and its use
is part of the instructions for use of the therapeutic product, FDA recommends that,
whenever possible, the candidate IVD companion diagnostic be validated as part of the
major efficacy trial(s).

Whenever an IVD is changed (e.g., changes in reagent configurations, instruments,
platforms, methods, calibration), the change may generate questions as to whether the
new test would result in the same clinical trial actions as the original test. If a revised
IVD is implemented, generally a bridging study (see Section III.E.3.) would be needed to
demonstrate high concordance between the two IVDs. Note that discordance between the
IVDs with respect to patient enrollment may make interpretation of clinical trial results
difficult or impossible.

3. IVD Bridging Studies
If a test other than the candidate IVD companion diagnostic is used for the major efficacy
trial(s), the IVD sponsor should demonstrate that the candidate IVD companion
diagnostic has performance characteristics that are very similar to those of the test that
was used in the trial (sometimes referred to as the clinical trial assay or CTA). This is
generally demonstrated through a bridging study between the two tests, using the original
clinical trial samples and a pre-specified statistical analysis plan, to show that results with
the candidate IVD companion diagnostic are very similar to those with the CTA. A
bridging study evaluates efficacy of the therapeutic product in subjects whose marker
status is determined by the candidate IVD companion diagnostic by assessing both
concordance and discordance between the two tests using the same specimens from
subjects who were tested for trial eligibility. The analysis needs to consider any potential
impact of missing samples not available for the concordance study. The ability of the
candidate IVD companion diagnostic to predict the efficacy of the therapeutic product
can be supported indirectly by high analytical concordance with the CTA on a large
number of representative samples, including samples from subjects excluded from the
trial because they were marker-negative by the CTA. Thus, FDA's assessment of the
clinical validity of the candidate IVD companion diagnostic will rely on extrapolating the
clinical performance characteristics of the CTA to the clinical performance characteristics
of the candidate IVD companion diagnostic.

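Concordance between the CTA and the candidate IVD is commonly summarized with agreement percentages computed from paired results on the same specimens. The Python sketch below shows one such summary, treating the CTA as the comparator. The specific measures (positive, negative, and overall percent agreement), the function name, and the data layout are assumptions for illustration. The guidance does not specify which statistics to report.

```python
def agreement(pairs):
    """Summarize paired qualitative results.

    `pairs` is an iterable of (cta_positive, candidate_positive) booleans,
    one pair per specimen tested with both assays.
    """
    both_pos = cta_only = cand_only = both_neg = 0
    for cta, cand in pairs:
        if cta and cand:
            both_pos += 1
        elif cta:
            cta_only += 1   # CTA positive, candidate negative
        elif cand:
            cand_only += 1  # CTA negative, candidate positive
        else:
            both_neg += 1
    n_cta_pos = both_pos + cta_only
    n_cta_neg = cand_only + both_neg
    total = n_cta_pos + n_cta_neg
    return {
        # Share of CTA-positive specimens that the candidate also calls positive.
        "ppa": both_pos / n_cta_pos if n_cta_pos else None,
        # Share of CTA-negative specimens that the candidate also calls negative.
        "npa": both_neg / n_cta_neg if n_cta_neg else None,
        # Share of all specimens on which the two assays agree.
        "opa": (both_pos + both_neg) / total if total else None,
    }
```

Negative percent agreement can only be estimated if CTA-negative specimens are retested, which is one reason the text calls for including samples from subjects excluded from the trial as marker-negative.
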
The ideal bridging study is one in which all samples tested with the trial test are retested
with the candidate IVD companion diagnostic and valid test results are obtained and used
to assess comparative performance.[72] A bridging study with specimens from an
all-comers trial also allows an analysis of efficacy using the results of the candidate IVD
companion diagnostic. Note, however, that care should be taken in understanding the
analytical performance of the IVD prior to the bridging study because adjustments to the
IVD should not be made from results obtained with the clinical trial samples (see Section
III.E.1.).

Whether a clinical trial enrolls subjects irrespective of the test result or enrolls only the
subset of subjects identified by the test result, both the test-negative and test-positive
clinical trial samples should be included in bridging studies to avoid bias due to
prescreening (see Section III.C.5.). FDA recognizes, however, that there are many
reasons why all the samples tested with the CTA may not be available for retesting,
including that samples are missing, not accessible, or insufficient in quantity to retest, and
it may not be possible to retest all samples. If only a subset of samples is retested, the
sponsor should ensure that the characteristics of the subset adequately reflect the
characteristics that affect test performance (e.g., tumor size, histology, melanin content,
necrotic tissue, resected tissue versus core needle biopsy) and that the characteristics of
the subjects that may affect therapeutic product efficacy (e.g., patient demographics,
stage of disease, stratification factors) are proportionally preserved in the retest sample
set when compared to the samples in the original set. In addressing baseline imbalance
between the retested and non-retested analysis sets, FDA recommends that sponsors
identify any covariates that can affect the test result and then check for baseline
imbalance between the retested and non-retested analysis sets using the set of covariates
identified.

[72] See Appendix 2 for a discussion of appropriate specimen handling, which can affect the validity of bridging
studies.

A re-analysis of the primary outcome data should be made according to the final test
results with the retest sample set in order to assure that any reclassification that occurs
does not alter conclusions about the safety and efficacy of the therapeutic product in the
selected population. When all samples are not retested, a second re-analysis can be
conducted in which missing data for the final test are imputed. The nature of the
re-analysis will be product-specific and may be discussed with the appropriate IVD review
center.

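The idea of re-analyzing the outcome according to the final test results can be sketched in Python as below. The sketch compares a treatment effect among subjects who are marker-positive by the CTA with the effect among subjects who are marker-positive by the candidate IVD. It uses a difference in response rates as a stand-in for the primary outcome, and it ignores imputation for non-retested samples. The `Subject` record, the field names, and the effect measure are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subject:
    arm: str                            # "treatment" or "control"
    responded: bool                     # stand-in for the primary outcome
    cta_positive: bool                  # result of the clinical trial assay
    candidate_positive: Optional[bool]  # None if the sample was not retested


def response_rate(subjects: List[Subject], arm: str) -> float:
    group = [s for s in subjects if s.arm == arm]
    if not group:
        raise ValueError(f"no subjects in arm {arm!r}")
    return sum(s.responded for s in group) / len(group)


def effect_in_positives(subjects: List[Subject], use_candidate: bool) -> float:
    """Response-rate difference (treatment minus control) among subjects
    classified marker-positive by the chosen assay. Subjects without a
    candidate result are excluded when use_candidate is True."""
    if use_candidate:
        positives = [s for s in subjects if s.candidate_positive is True]
    else:
        positives = [s for s in subjects if s.cta_positive]
    return response_rate(positives, "treatment") - response_rate(positives, "control")
```

Comparing `effect_in_positives(subjects, use_candidate=False)` with `effect_in_positives(subjects, use_candidate=True)` gives a first indication of whether reclassification changes the apparent treatment effect. An actual re-analysis would follow the pre-specified statistical analysis plan and the trial's real primary endpoint.
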
Finally, additional analytical validation may be requested to support satisfactory
concordance across methods where discordance may arise, e.g., precision, limit of
detection, and accuracy. In the event there is discordance in a marker-positive-only trial,
it is possible that the candidate IVD companion diagnostic will more accurately predict
responders, a difference that would represent an advantage for optimal use of the
therapeutic product.