The CAS Loss Reserving Database¶
The CAS Loss Reserving Database is a collection of loss data originally compiled by Glenn Meyers intended to be used for claims reserving studies. The data includes major personal and commercial lines of business from U.S. property casualty insurers. The claims data comes from Schedule P Analysis of Losses and Loss Expenses in the National Association of Insurance Commissioners (NAIC) database. The final product is a dataset that contains run-off triangles of six lines of business for all U.S. property casualty insurers. The triangle data correspond to claims of accident year 1988-1997 with 10 years development lag. Both upper and lower triangles are included so that one could use the data to develop a model and then test its performance retrospectively 1.
- loss_key
 Currently limited to
"com_auto", but will be expanded in a future release (see mapping below)- grcode
 NAIC company code (including insurer groups and single insurers)
- grname
 NAIC company name (including insurer groups and single insurers)
- origin
 Accident year
- dev
 Development year
- incrd_loss
 Incurred losses and allocated expenses reported at year end
- paid_loss
 paid losses and allocated expenses at year end
- ep_dir
 Premiums earned at incurral year - direct and assumed
- ep_ceded
 Premiums earned at incurral year - ceded
- ep_net
 earned_prem_dir - earned_prem_ceded
- single
 1 indicates a single entity, 0 indicates a group insurer
- posted_reserve_97
 Posted reserves in year 1997 taken from the Underwriting and Investment Exhibit Part 2A, including net losses unpaid and unpaid loss adjustment expenses
- train_ind
 A 1 indicates that the value associated with a particular origin-dev combination would fall in the upper-left of a typical loss triangle. * indicates what would typically be indentified as a future observation at the time triangles are constructed. The lower right triangle values can be treated as a holdout set to test the adequacy of a reserve estimator.
The following table provides a description of the type of losses associated
with each unique combination of loss_key and
loss_type_suffix:
loss_type_suffix  | 
loss_key  | 
description  | 
|---|---|---|
D  | 
wc  | 
Workers Compensation  | 
R  | 
prod_liab  | 
Products Liability - Occurance  | 
B  | 
pp_auto  | 
Private Passenger Auto Liability/Medical  | 
H  | 
othr_liab  | 
Other Liability - Occurrence  | 
F  | 
med_mal  | 
Medical Malpractice - Claims Made  | 
C  | 
comauto  | 
Commercial Auto Liability/Medical  | 
A copy of the CAS Loss Reserving Database is distributed along with trikit, in addition to a number of convenience functions intended to simplify database interaction. In what follows, CAS Loss Reserving Database convenience functions are detailed.
Loss Reserve Database API¶
- 
load_lrdb(tri_type=None, loss_type='incurred', lob='comauto', grcode=1767, grname=None, train_only=True)¶ Load the CAS Loss Reserving Database subset of losses. Additional keyword arguments are used to subset the CAS Loss Reserving Database to the records of interest. Within the Loss Reserving Database, “lob” and “grcode” uniquely partition losses into 100 record blocks if
lower_right_ind=True, otherwise losses are partitioned into 55 record blocks. All available combinations of “lob” and “grcode” (referred to as “specs”) can be obtained by callingget_lrdb_specs. Note that whentri_typeis “cum” or “incr”, the remaining subset of records after applyinglob,grcodeandgrnamefilters will be aggregated into a single triangle of the specified type.- Parameters
 tri_type (str) – If
None, lrdb subset is returned as pd.DataFrame. Otherwise, return subset as either incremental or cumulative triangle type. Default value is None.lob (str) – Specifies the line of business to return. Available options are
['comauto', 'ppauto', 'wkcomp', 'medmal', 'prodliab', 'othliab']. Default value is “comauto”.grcode (int) – NAIC company code including insurer groups and single insurers. For a full listing, call
get_lrdb_specs. Default value is1767.grname (str) – NAIC company name (including insurer groups and single insurers). The complete mapping of available grcodes can be obtained by calling
get_lrdb_specs. Default value is None.loss_type (str) – Specifies which loss data to load. Can be one of “paid” or “incurred”. Defaults to “incurred”. Note that bulk losses have already been subtracted from schedule P incurred losses. Default value is “incurred”.
train_only (bool) – If True, the upper-left portion of the triangle will be returned. The upper-left portion of the triangle typically consists of actual loss experience. If False, the squared triangle, consisting of 100 observations is returned. Default value is True.
- Returns
 Either pd.DataFrame, trikit.triangle.IncrTriangle or trikit.triangle.CumTriangle.
CAS Loss Reserving Database datasets are loaded via load_lrdb. A number of additional keyword arguments can be
passed to load_lrdb. For example, to retrieve the subset of commercial auto losses for
grcode=1767:
In [1]: from trikit import load_lrdb
In [2]: df = load_lrdb(grcode=1767, lob="comauto")
In [3]: type(df)
Out[3]: pandas.core.frame.DataFrame
Alternatively, the resulting subset can be transformed to a triangle of the desired type:
In [4]: tri = load_lrdb(tri_type="cum", grcode=1767, lob="comauto")
In [5]: tri
Out[5]:
          1       2       3       4       5         6         7         8         9         10
1988 110,231 263,079 431,216 611,278 797,428   985,570 1,174,922 1,366,229 1,558,096 1,752,096
1989 121,678 279,896 456,640 644,767 837,733 1,033,837 1,233,015 1,432,670 1,633,619       nan
1990 123,376 298,615 500,570 714,683 934,671 1,157,979 1,383,820 1,610,193       nan       nan
1991 117,457 280,058 463,396 662,003 865,401 1,071,271 1,278,228       nan       nan       nan
1992 124,611 291,399 481,170 682,203 889,029 1,101,390       nan       nan       nan       nan
1993 137,902 323,854 533,211 753,639 980,180       nan       nan       nan       nan       nan
1994 150,582 345,110 561,315 792,392     nan       nan       nan       nan       nan       nan
1995 150,511 345,241 560,278     nan     nan       nan       nan       nan       nan       nan
1996 142,301 326,584     nan     nan     nan       nan       nan       nan       nan       nan
1997 143,970     nan     nan     nan     nan       nan       nan       nan       nan       nan
Notice that with grcode and lob specified as above, the returned DataFrame contains 55 records
as expected (recall that by default, train_ind is set to False, otherwise the shape of the returned
DataFrame would be (100, 3).
CAS Loss Reserving Database lines of business can be listed by calling get_lrdb_lobs:
In [1]: from trikit import get_lrdb_lobs
In [2]: get_lrdb_lobs()
Out[2]: ['comauto', 'ppauto', 'wkcomp', 'medmal', 'prodliab', 'othliab']
Unique combinations of “loss_key”, “grname” and “grcode” can be listed by calling get_lrdb_specs:
In [1]: from trikit import get_lrdb_specs
In [2]: get_lrdb_specs()
Out[2]:
    loss_key  grcode                               grname
0    comauto     266              Public Underwriters Grp
1    comauto     337                   California Cas Grp
2    comauto     353                       Celina Mut Grp
3    comauto     388                   Federal Ins Co Grp
4    comauto     460                      Buckeye Ins Grp
..       ...     ...                                  ...
431   wkcomp   41580                    Red Shield Ins Co
432   wkcomp   42439                Toa-Re Ins Co Of Amer
433   wkcomp   43915                       Rainier Ins Co
434   wkcomp   44091  Dowa Fire & Marine Ins Co Ltd Us Br
435   wkcomp   44300                   Tower Ins Co Of NY
Footnotes