Gas Production Profiles in a Streamlit App
Dutch gas production data is publicly available, and its abundance makes it a suitable big data project to dissect using Python packages including (geo)pandas, matplotlib, folium, json and seaborn. This article explores how an exploratory data analysis tool can be built using Streamlit as the app framework.
Streamlit is an open-source framework built on Python, which allows for rapid deployment of Python code to an application.
Consolidating data combined with effective visualisation can save hours of time searching databases and extracting useful information. Streamlit’s user-friendly open-source framework allows for neat and elegant deployment of products derived from processed data in Python.
Python makes it possible to gain powerful insights with just a few lines of code. Subsequently, high-level observations about macro (gas production) trends can be made to drive meaningful policy and business decisions. The next sections cover the datasets and key functionality of the app.
Datasets
The following open-source datasets have been used to build this app:
- NLOG: Production data for fields and wells onshore and offshore the Netherlands. The natural (indigenous) gas production shown in the app covers gas extracted from onshore fields and from offshore fields in the Dutch territorial part of the North Sea. Data is available for oil, gas and geothermal wells for every month since 2003.
- CBS: National data on gas production, imports (liquefied natural gas (LNG) & natural gas), exports and stock change.
App Functionality
The production data spans the period from 2003 to 2021 and can be grouped by year, well, field, operator and hydrocarbon phase.
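The grouping described above can be sketched with a pandas `groupby`. This is a minimal illustration on an invented toy table: the column names mirror those used later in the article, while the helper `aggregate_production` and all values are assumptions made for the example, not the app's actual code.

```python
import pandas as pd

# Toy production table; values are invented for illustration only.
df = pd.DataFrame({
    "OPERATOR": ["NAM", "NAM", "NAM", "Taqa", "Taqa"],
    "FIELD": ["F1", "F1", "F2", "F3", "F3"],
    "YEAR": [2003, 2003, 2003, 2003, 2004],
    "PRODUCTION": [1.0, 2.0, 3.0, 4.0, 5.0],
})

def aggregate_production(data, by):
    """Sum production for each combination of the grouping columns (hypothetical helper)."""
    return data.groupby(by, as_index=False)["PRODUCTION"].sum()

# Same data, two levels of aggregation.
per_operator_year = aggregate_production(df, ["OPERATOR", "YEAR"])
per_field_year = aggregate_production(df, ["OPERATOR", "FIELD", "YEAR"])
print(per_operator_year)
```

Passing the grouping columns as a list keeps the aggregation level a runtime choice, which is what a dropdown-driven app needs.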
Data filters are added to provide customisation to the figures and map. We can dynamically extract unique values from our dataframe that respond to the user inputs. Based on user inputs, the dataframe is filtered and figures are updated.
Dynamic Data Filter
The aim of this project is to gain valuable insights from a large dataset of production profiles by dissecting it with Python code. The Streamlit app can generate and store figures on the fly based on user inputs.
The user input filters the data and dynamically updates figures and the map in the app as shown here:
import streamlit as st

# df is the production dataframe loaded earlier in the app.
# Each dropdown is built from the rows that survive the previous filter.
operator_dropdown = ['ALL'] + list(df['OPERATOR'].unique())
operator = st.selectbox('Operator', operator_dropdown)
if operator != 'ALL':
    df = df.loc[df['OPERATOR'] == operator].reset_index(drop=True)

field_dropdown = ['ALL'] + list(df['FIELD'].unique())
field = st.selectbox('Field', field_dropdown)
if field != 'ALL':
    df = df.loc[df['FIELD'] == field].reset_index(drop=True)

well_dropdown = ['ALL'] + list(df['WELL'].unique())
well = st.selectbox('Well', well_dropdown)
if well != 'ALL':
    df = df.loc[df['WELL'] == well].reset_index(drop=True)
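The cascading logic can also be isolated from the Streamlit widgets, which makes it easier to test. The sketch below is one possible refactoring, not the app's actual code: `dropdown_options` and `apply_filter` are hypothetical helper names, the toy table is invented, and the hard-coded choices stand in for the values that `st.selectbox` would return.

```python
import pandas as pd

def dropdown_options(data, column):
    """Options for a select box: 'ALL' followed by the unique values left in the frame."""
    return ["ALL"] + list(data[column].unique())

def apply_filter(data, column, choice):
    """Return the frame unchanged for 'ALL', otherwise only the rows matching the choice."""
    if choice == "ALL":
        return data
    return data.loc[data[column] == choice].reset_index(drop=True)

# Toy data, invented for illustration.
df = pd.DataFrame({
    "OPERATOR": ["NAM", "NAM", "Taqa"],
    "FIELD": ["F1", "F2", "F3"],
    "WELL": ["W1", "W2", "W3"],
})

# Simulated user choices; in the app these would come from st.selectbox.
filtered = apply_filter(df, "OPERATOR", "NAM")
field_options = dropdown_options(filtered, "FIELD")  # only NAM's fields remain
filtered = apply_filter(filtered, "FIELD", "F2")
well_options = dropdown_options(filtered, "WELL")    # only wells in field F2 remain
```

Because every list of options is computed from the already-filtered frame, a later dropdown can never offer a value that contradicts an earlier selection.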
To understand the relative contribution of individual components to the aggregate, the data can be grouped according to the aforementioned filters. In the example below we study how the contribution of individual fields to operator production, and of individual wells to field production, changes over time.
# Total production per operator and year (the denominator for the share)
fd['FIELD_PRODUCTION_YEAR'] = fd.groupby(['OPERATOR', 'YEAR'])['PRODUCTION'].transform('sum')

# Calculate the percentage
fd['PRODUCTION_FIELD_SHARE'] = 100 * fd['PRODUCTION'] / fd['FIELD_PRODUCTION_YEAR']

# Check if all numbers add up to 100%
fd.loc[(fd['OPERATOR'] == 'Dana') & (fd['YEAR'] == 2009)]['PRODUCTION_FIELD_SHARE'].sum()

# Create plots
import matplotlib.pyplot as plt
import numpy as np

def field_fig(field, df, field_op, operator):
    # One colour per operator, used for the production bars
    colors = {'Dana': '#fee391',
              'GDF': '#ef3b2c',
              'Kistos': '#e0f3f8',
              'NAM': '#41ab5d',
              'Neptune': '#fb6a4a',
              'ONE-Dyas': '#e6f5d0',
              'Petrogas': '#ffffbf',
              'Spirit': '#e7298a',
              'Taqa': '#74add1',
              'Total': '#8c6bb1',
              'Vermilion': '#636363',
              'Wintershall': '#4292c6',
              'ALL': '#ef3b2c'}

    # If no field is selected, fall back to the first field of the operator
    if field == 'ALL':
        field = field_op.loc[field_op['OPERATOR'] == operator].reset_index(drop=True)['FIELD'][0]
    df = df.loc[df['FIELD'] == field]

    # Left axis: annual production as bars
    fig2, ax2 = plt.subplots(figsize=(12, 8))
    ax2.set_xlim(2003 - .5, 2021 + .5)
    ax2.bar(df['YEAR'], df['PRODUCTION'], alpha=0.5, label=f'{field}',
            color=colors[operator], edgecolor='black')
    ax2.set_title(f'{field} Field Annual Production since 2003', fontsize=25)
    ax2.grid()
    ax2.legend(loc='upper right', prop={'size': 24})
    ax2.tick_params(axis='both', which='major', labelsize=20)
    ax2.set_xticks(np.arange(2003, 2022, 2))
    ax2.set_ylabel('Production bln m\u00b3', fontsize=26)
    ax2.set_xlabel('Year', fontsize=26)

    # Right axis: the field's share of operator production as a line
    ax22 = ax2.twinx()
    ax22.plot(df['YEAR'], df['PRODUCTION_FIELD_SHARE'], color='blue', linewidth=2)
    ax22.set_ylabel('Operator Share %', fontsize=23, color='blue')
    ax22.tick_params(axis='both', which='major', labelsize=16)
    ax22.set_ylim([0, 100])
    ax22.tick_params(axis='y', colors='blue')
    plt.tight_layout()

    # Returning the figure is an addition here so the caller can render or save it
    return fig2
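The share check shown earlier inspects a single operator and year. A minimal sketch of extending it to every operator and year at once follows; the toy table is invented, and the names `total`, `share_sums` and `all_add_up` are assumptions introduced for this example.

```python
import numpy as np
import pandas as pd

# Toy field-level table: one row per field and year, values invented.
fd = pd.DataFrame({
    "OPERATOR": ["Dana", "Dana", "NAM", "NAM", "NAM"],
    "FIELD": ["A", "B", "C", "D", "E"],
    "YEAR": [2009, 2009, 2009, 2009, 2009],
    "PRODUCTION": [1.0, 3.0, 2.0, 2.0, 4.0],
})

# transform('sum') broadcasts each group total back onto the group's rows,
# so the division below is row-aligned without a merge.
total = fd.groupby(["OPERATOR", "YEAR"])["PRODUCTION"].transform("sum")
fd["PRODUCTION_FIELD_SHARE"] = 100 * fd["PRODUCTION"] / total

# Every (operator, year) group should sum to 100, up to floating-point error.
share_sums = fd.groupby(["OPERATOR", "YEAR"])["PRODUCTION_FIELD_SHARE"].sum()
all_add_up = bool(np.isclose(share_sums, 100).all())
print(share_sums)
```

Comparing with `np.isclose` rather than `==` avoids false alarms from rounding. Note that a group with zero total production yields NaN shares and would fail this check, so such groups may need separate handling.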
Show Information When Hovering Over Fields in Map
To visualise the data, the app contains a folium map with two layers of geospatial data: licenses and fields. Both are supplied as shapefiles and can be converted to GeoJSON.
If an operator is selected, the fields associated with that operator are highlighted on the map.
Hovering over a field displays supplementary information such as field name, operator, well count, production share and total production.
import folium

def add_fields(m, gdf, colors):
    # Style applied while the mouse hovers over a field
    highlight_function = lambda x: {'fillColor': '#000000',
                                    'color': '#000000',
                                    'fillOpacity': 0.80,
                                    'weight': 0.1}

    # Add every field as its own GeoJson layer so each gets its own tooltip
    for i in range(len(gdf)):
        row = gdf.iloc[[i]]  # one-row GeoDataFrame, selected by position
        feature = folium.features.GeoJson(
            row,
            style_function=lambda feature: {
                'fillColor': getcolor(feature, colors),  # getcolor is defined elsewhere in the app
                'weight': 1,
                'opacity': .8,
                'color': '#525252',
                'fillOpacity': 0.6},
            control=False,
            zoom_on_click=True,
            highlight_function=highlight_function,
            tooltip=folium.features.GeoJsonTooltip(
                fields=['FIELD', 'OPERATOR', 'WELL_COUNT', '2021_SHARE', 'PRODUCTION_SINCE_2003'],
                aliases=['Field: ',
                         'Operator: ',
                         'Well count: ',
                         '2021 National Prod. %: ',
                         'Production since 2003 bln m\u00b3: '],
                style=("background-color: white; color: #333333; font-family: arial; font-size: 13px; padding: 4px;"),
                sticky=True))
        m.add_child(feature)
        m.keep_in_front(feature)
    return m
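The function above calls a helper `getcolor` that is not shown in this article. The sketch below is a hypothetical version, assuming each GeoJSON feature carries an `OPERATOR` property and that fields of unknown operators should fall back to a neutral grey; the real helper in the app may differ, for example by also reacting to the selected operator.

```python
def getcolor(feature, colors, default="#bdbdbd"):
    """Return the fill colour for a GeoJSON feature based on its operator.

    Hypothetical helper: `feature` is the GeoJSON feature dict that folium
    passes to a style_function, `colors` maps operator names to hex colours.
    """
    operator = feature.get("properties", {}).get("OPERATOR")
    return colors.get(operator, default)

# Usage with a hand-written feature; geometry is omitted for brevity.
example_colors = {"NAM": "#41ab5d", "Taqa": "#74add1"}
example_feature = {"type": "Feature", "properties": {"OPERATOR": "NAM"}}
print(getcolor(example_feature, example_colors))
```

Using `dict.get` with a default keeps the map rendering even if a shapefile contains an operator that is missing from the colour table.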
To access the complete code for the application click on the link below: