The CAS Loss Reserving Database¶
The CAS Loss Reserving Database is a collection of loss data originally compiled by Glenn Meyers intended to be used for claims reserving studies. The data includes major personal and commercial lines of business from U.S. property casualty insurers. The claims data comes from Schedule P Analysis of Losses and Loss Expenses in the National Association of Insurance Commissioners (NAIC) database. The final product is a dataset that contains run-off triangles of six lines of business for all U.S. property casualty insurers. The triangle data correspond to claims of accident year 1988-1997 with 10 years development lag. Both upper and lower triangles are included so that one could use the data to develop a model and then test its performance retrospectively 1.
- loss_key
Currently limited to
"com_auto"
, but will be expanded in a future release (see mapping below)- grcode
NAIC company code (including insurer groups and single insurers)
- grname
NAIC company name (including insurer groups and single insurers)
- origin
Accident year
- dev
Development year
- incrd_loss
Incurred losses and allocated expenses reported at year end
- paid_loss
paid losses and allocated expenses at year end
- ep_dir
Premiums earned at incurral year - direct and assumed
- ep_ceded
Premiums earned at incurral year - ceded
- ep_net
earned_prem_dir - earned_prem_ceded
- single
1 indicates a single entity, 0 indicates a group insurer
- posted_reserve_97
Posted reserves in year 1997 taken from the Underwriting and Investment Exhibit Part 2A, including net losses unpaid and unpaid loss adjustment expenses
- train_ind
A 1 indicates that the value associated with a particular origin-dev combination would fall in the upper-left of a typical loss triangle. * indicates what would typically be indentified as a future observation at the time triangles are constructed. The lower right triangle values can be treated as a holdout set to test the adequacy of a reserve estimator.
The following table provides a description of the type of losses associated
with each unique combination of loss_key
and
loss_type_suffix
:
loss_type_suffix |
loss_key |
description |
---|---|---|
D |
wc |
Workers Compensation |
R |
prod_liab |
Products Liability - Occurance |
B |
pp_auto |
Private Passenger Auto Liability/Medical |
H |
othr_liab |
Other Liability - Occurrence |
F |
med_mal |
Medical Malpractice - Claims Made |
C |
comauto |
Commercial Auto Liability/Medical |
A copy of the CAS Loss Reserving Database is distributed along with trikit, in addition to a number of convenience functions intended to simplify database interaction. In what follows, CAS Loss Reserving Database convenience functions are detailed.
Loss Reserve Database API¶
-
load_lrdb
(tri_type=None, loss_type='incurred', lob='comauto', grcode=1767, grname=None, train_only=True)¶ Load the CAS Loss Reserving Database subset of losses. Additional keyword arguments are used to subset the CAS Loss Reserving Database to the records of interest. Within the Loss Reserving Database, “lob” and “grcode” uniquely partition losses into 100 record blocks if
lower_right_ind=True
, otherwise losses are partitioned into 55 record blocks. All available combinations of “lob” and “grcode” (referred to as “specs”) can be obtained by callingget_lrdb_specs
. Note that whentri_type
is “cum” or “incr”, the remaining subset of records after applyinglob
,grcode
andgrname
filters will be aggregated into a single triangle of the specified type.- Parameters
tri_type (str) – If
None
, lrdb subset is returned as pd.DataFrame. Otherwise, return subset as either incremental or cumulative triangle type. Default value is None.lob (str) – Specifies the line of business to return. Available options are
['comauto', 'ppauto', 'wkcomp', 'medmal', 'prodliab', 'othliab']
. Default value is “comauto”.grcode (int) – NAIC company code including insurer groups and single insurers. For a full listing, call
get_lrdb_specs
. Default value is1767
.grname (str) – NAIC company name (including insurer groups and single insurers). The complete mapping of available grcodes can be obtained by calling
get_lrdb_specs
. Default value is None.loss_type (str) – Specifies which loss data to load. Can be one of “paid” or “incurred”. Defaults to “incurred”. Note that bulk losses have already been subtracted from schedule P incurred losses. Default value is “incurred”.
train_only (bool) – If True, the upper-left portion of the triangle will be returned. The upper-left portion of the triangle typically consists of actual loss experience. If False, the squared triangle, consisting of 100 observations is returned. Default value is True.
- Returns
Either pd.DataFrame, trikit.triangle.IncrTriangle or trikit.triangle.CumTriangle.
CAS Loss Reserving Database datasets are loaded via load_lrdb
. A number of additional keyword arguments can be
passed to load_lrdb
. For example, to retrieve the subset of commercial auto losses for
grcode=1767
:
In [1]: from trikit import load_lrdb
In [2]: df = load_lrdb(grcode=1767, lob="comauto")
In [3]: type(df)
Out[3]: pandas.core.frame.DataFrame
Alternatively, the resulting subset can be transformed to a triangle of the desired type:
In [4]: tri = load_lrdb(tri_type="cum", grcode=1767, lob="comauto")
In [5]: tri
Out[5]:
1 2 3 4 5 6 7 8 9 10
1988 110,231 263,079 431,216 611,278 797,428 985,570 1,174,922 1,366,229 1,558,096 1,752,096
1989 121,678 279,896 456,640 644,767 837,733 1,033,837 1,233,015 1,432,670 1,633,619 nan
1990 123,376 298,615 500,570 714,683 934,671 1,157,979 1,383,820 1,610,193 nan nan
1991 117,457 280,058 463,396 662,003 865,401 1,071,271 1,278,228 nan nan nan
1992 124,611 291,399 481,170 682,203 889,029 1,101,390 nan nan nan nan
1993 137,902 323,854 533,211 753,639 980,180 nan nan nan nan nan
1994 150,582 345,110 561,315 792,392 nan nan nan nan nan nan
1995 150,511 345,241 560,278 nan nan nan nan nan nan nan
1996 142,301 326,584 nan nan nan nan nan nan nan nan
1997 143,970 nan nan nan nan nan nan nan nan nan
Notice that with grcode
and lob
specified as above, the returned DataFrame contains 55 records
as expected (recall that by default, train_ind
is set to False, otherwise the shape of the returned
DataFrame would be (100, 3).
CAS Loss Reserving Database lines of business can be listed by calling get_lrdb_lobs
:
In [1]: from trikit import get_lrdb_lobs
In [2]: get_lrdb_lobs()
Out[2]: ['comauto', 'ppauto', 'wkcomp', 'medmal', 'prodliab', 'othliab']
Unique combinations of “loss_key”, “grname” and “grcode” can be listed by calling get_lrdb_specs
:
In [1]: from trikit import get_lrdb_specs
In [2]: get_lrdb_specs()
Out[2]:
loss_key grcode grname
0 comauto 266 Public Underwriters Grp
1 comauto 337 California Cas Grp
2 comauto 353 Celina Mut Grp
3 comauto 388 Federal Ins Co Grp
4 comauto 460 Buckeye Ins Grp
.. ... ... ...
431 wkcomp 41580 Red Shield Ins Co
432 wkcomp 42439 Toa-Re Ins Co Of Amer
433 wkcomp 43915 Rainier Ins Co
434 wkcomp 44091 Dowa Fire & Marine Ins Co Ltd Us Br
435 wkcomp 44300 Tower Ins Co Of NY
Footnotes