python-benedict/README.md

22 KiB

Py versions PyPI version PyPI downloads Stars Issues License

CircleCI Build Status codecov Codacy Badge Scrutinizer Code Quality Requirements Status

python-benedict

python-benedict is a dict subclass with keypath support, I/O shortcuts (Base64, CSV, JSON, TOML, XML, YAML, query-string) and many utilities... for humans, obviously.

Index

Features

  • Full keypath support using keypath-separator (dot syntax by default) or list of keys.
  • Easy I/O operations with most common formats: Base64, CSV, JSON, TOML, XML, YAML, query-string
  • Many utility and parse methods to retrieve data as needed (all methods listed below)
  • Well tested, check the badges ;)
  • 100% backward-compatible (you can replace existing dicts without pain)

Requirements

  • Python 2.7, 3.4, 3.5, 3.6, 3.7

Installation

  • Run pip install python-benedict

Usage

Basics

benedict is a dict subclass, so it is possible to use it as a normal dictionary (you can just cast an existing dict).

from benedict import benedict

# create a new empty instance
d = benedict()

# or cast an existing dict
d = benedict(existing_dict)

# or create from data source (filepath, url or data-string) in a supported format (base64, json, toml, xml, yaml, query-string)
d = benedict('https://localhost:8000/data.json')

# or in a Django view
params = benedict(request.GET.items())
page = params.get_int('p', 0)

Keypath

. is the default keypath separator.

If you cast an existing dict and its keys contain the keypath separator a ValueError will be raised.

In this case you should use a custom keypath separator or disable keypath functionality.

d = benedict()

# set values by keypath
d['profile.firstname'] = 'Fabio'
d['profile.lastname'] = 'Caccamo'
print(d) # -> { 'profile':{ 'firstname':'Fabio', 'lastname':'Caccamo' } }
print(d['profile']) # -> { 'firstname':'Fabio', 'lastname':'Caccamo' }

# check if keypath exists in dict
print('profile.lastname' in d) # -> True

# delete value by keypath
del d['profile.lastname']

It is possible to do the same using a list of keys:

d = benedict()

# set values by keys list
d['profile', 'firstname'] = 'Fabio'
d['profile', 'lastname'] = 'Caccamo'
print(d) # -> { 'profile':{ 'firstname':'Fabio', 'lastname':'Caccamo' } }
print(d['profile']) # -> { 'firstname':'Fabio', 'lastname':'Caccamo' }

# check if keypath exists in dict
print(['profile', 'lastname'] in d) # -> True

# delete value by keys list
del d['profile', 'lastname']

Custom keypath separator

You can customize the keypath separator passing the keypath_separator argument in the constructor.

If you pass an existing dict to the constructor and its keys contain the keypath separator an Exception will be raised.

d = benedict(existing_dict, keypath_separator='/')

Change keypath separator

You can change the keypath_separator at any time using the getter/setter property.

If any existing key contains the new keypath_separator an Exception will be raised.

d.keypath_separator = '/'

Disable keypath functionality

You can disable the keypath functionality passing keypath_separator=None in the constructor.

d = benedict(existing_dict, keypath_separator=None)

You can disable the keypath functionality using the getter/setter property.

d.keypath_separator = None

API

Utility

These methods are common utilities that will speed up your everyday work.

Utilities that accepts key argument(s) also accepts keypath(s).

Utilities that return a dictionary always return a new benedict instance.

  • clean

# Clean the current dict removing all empty values: None, '', {}, [], ().
# If strings, dicts or lists flags are False, related empty values will not be deleted.
d.clean(strings=True, dicts=True, lists=True)
  • clone

# Return a clone (deepcopy) of the dict.
c = d.clone()
  • dump

# Return a readable representation of any dict/list.
# This method can be used both as static method or instance method.
s = benedict.dump(d.keypaths())
print(s)
# or
d = benedict()
print(d.dump())
  • filter

# Return a filtered dict using the given predicate function.
# Predicate function receives key, value arguments and should return a bool value.
predicate = lambda k, v: v is not None
f = d.filter(predicate)
  • flatten

# Return a flatten dict using the given separator to concat nested dict keys.
f = d.flatten(separator='_')
  • invert

# Return an inverted dict where values become keys and keys become values.
# Since multiple keys could have the same value, each value will be a list of keys.
# If flat is True each value will be a single value (use this only if values are unique).
i = d.invert(flat=False)
  • items_sorted_by_keys

# Return items (key/value list) sorted by keys.
# If reverse is True, the list will be reversed.
items = d.items_sorted_by_keys(reverse=False)
  • items_sorted_by_values

# Return items (key/value list) sorted by values.
# If reverse is True, the list will be reversed.
items = d.items_sorted_by_values(reverse=False)
  • keypaths

# Return a list of all keypaths in the dict.
k = d.keypaths()
print(k)
  • merge

# Merge one or more dictionary objects into current instance (deepupdate).
# Sub-dictionaries keys will be merged toghether.
d.merge(a, b, c)
  • move

# Move an item from key_src to key_dst.
# It can be used to rename a key.
# If key_dst exists, its value will be overwritten.
d.move('a', 'b')
  • remove

# Remove multiple keys from the dict.
# It is possible to pass a single key or more keys (as list or *args).
d.remove(['firstname', 'lastname', 'email'])
  • standardize

# Standardize all dict keys, e.g. "Location Latitude" -> "location_latitude".
d.standardize()
  • subset

# Return a dict subset for the given keys.
# It is possible to pass a single key or more keys (as list or *args).
s = d.subset(['firstname', 'lastname', 'email'])
  • swap

# Swap items values at the given keys.
d.swap('firstname', 'lastname')
  • traverse

# Traverse a dict passing each item (dict, key, value) to the given callback function.
def f(d, key, value):
    print('dict: {} - key: {} - value: {}'.format(d, key, value))
d.traverse(f)
  • unique

# Remove duplicated values from the dict.
d.unique()

I/O

It is possible to create a benedict instance directly from data source (filepath, url or data-string) by passing the data source and the data format (default 'json') in the constructor.

# filepath
d = benedict('/root/data.yml', format='yaml')

# url
d = benedict('https://localhost:8000/data.xml', format='xml')

# data-string
d = benedict('{"a": 1, "b": 2, "c": 3, "x": 7, "y": 8, "z": 9}')

These methods simplify I/O operations with most common formats: base64, csv, json, toml, xml, yaml, query-string

In all from_* methods, the first argument can be: url, filepath or data-string.

In all to_* methods, if filepath='...' kwarg is specified, the output will be also saved at the specified filepath.

  • from_base64

# Try to load/decode a base64 encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to choose the subformat used under the hood (`csv`, `json`, `query-string`, `toml`, `xml`, `yaml`), default: 'json'.
# It's possible to choose the encoding, default 'utf-8'.
# A ValueError is raised in case of failure.
d = benedict.from_base64(s, subformat='json', encoding='utf-8', **kwargs)
  • from_csv

# Try to load/decode a csv encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.ù
# It's possible to specify the columns list, default: None (in this case the first row values will be used as keys).
# It's possible to pass decoder specific options using kwargs: https://docs.python.org/3/library/csv.html
# A ValueError is raised in case of failure.
d = benedict.from_csv(s, columns=None, columns_row=True, **kwargs)
  • from_json

# Try to load/decode a json encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs: https://docs.python.org/3/library/json.html
# A ValueError is raised in case of failure.
d = benedict.from_json(s, **kwargs)
  • from_query_string

# Try to load/decode a query-string and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# A ValueError is raised in case of failure.
d = benedict.from_query_string(s, **kwargs)
  • from_toml

# Try to load/decode a toml encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs: https://pypi.org/project/toml/
# A ValueError is raised in case of failure.
d = benedict.from_toml(s, **kwargs)
  • from_xml

# Try to load/decode a xml encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs: https://github.com/martinblech/xmltodict
# A ValueError is raised in case of failure.
d = benedict.from_xml(s, **kwargs)
  • from_yaml

# Try to load/decode a yaml encoded data and return it as benedict instance.
# Accept as first argument: url, filepath or data-string.
# It's possible to pass decoder specific options using kwargs: https://pyyaml.org/wiki/PyYAMLDocumentation
# A ValueError is raised in case of failure.
d = benedict.from_yaml(s, **kwargs)
  • to_base64

# Return the dict instance encoded in base64 format and optionally save it at the specified 'filepath'.
# It's possible to choose the subformat used under the hood ('csv', json', `query-string`, 'toml', 'xml', 'yaml'), default: 'json'.
# It's possible to choose the encoding, default 'utf-8'.
# It's possible to pass decoder specific options using kwargs.
# A ValueError is raised in case of failure.
s = d.to_base64(subformat='json', encoding='utf-8', **kwargs)
  • to_csv

# Return a list of dicts encoded in csv format and optionally save it at the specified filepath.
# It's possible to specify the key of the item (list of dicts) to encode, default: 'values'.
# It's possible to specify the columns list, default: None (in this case the keys of the first item will be used).
# A ValueError is raised in case of failure.
d = benedict.to_csv(key='values', columns=None, columns_row=True, **kwargs)
  • to_json

# Return the dict instance encoded in json format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs: https://docs.python.org/3/library/json.html
# A ValueError is raised in case of failure.
s = d.to_json(**kwargs)
  • to_query_string

# Return the dict instance as query-string and optionally save it at the specified filepath.
# A ValueError is raised in case of failure.
s = d.to_query_string(**kwargs)
  • to_toml

# Return the dict instance encoded in toml format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs: https://pypi.org/project/toml/
# A ValueError is raised in case of failure.
s = d.to_toml(**kwargs)
  • to_xml

# Return the dict instance encoded in xml format and optionally save it at the specified filepath.
# It's possible to pass encoder specific options using kwargs: https://github.com/martinblech/xmltodict
# A ValueError is raised in case of failure.
s = d.to_xml(**kwargs)
  • to_yaml

# Return the dict instance encoded in yaml format.
# If filepath option is passed the output will be saved ath
# It's possible to pass encoder specific options using kwargs: https://pyyaml.org/wiki/PyYAMLDocumentation
# A ValueError is raised in case of failure.
s = d.to_yaml(**kwargs)

Parse

These methods are wrappers of the get method, they parse data trying to return it in the expected type.

  • get_bool

# Get value by key or keypath trying to return it as bool.
# Values like `1`, `true`, `yes`, `on`, `ok` will be returned as `True`.
d.get_bool(key, default=False)
  • get_bool_list

# Get value by key or keypath trying to return it as list of bool values.
# If separator is specified and value is a string it will be splitted.
d.get_bool_list(key, default=[], separator=',')
  • get_datetime

# Get value by key or keypath trying to return it as datetime.
# If format is not specified it will be autodetected.
# If choices and value is in choices return value otherwise default.
d.get_datetime(key, default=None, format=None, choices=[])
  • get_datetime_list

# Get value by key or keypath trying to return it as list of datetime values.
# If separator is specified and value is a string it will be splitted.
d.get_datetime_list(key, default=[], format=None, separator=',')
  • get_decimal

# Get value by key or keypath trying to return it as Decimal.
# If choices and value is in choices return value otherwise default.
d.get_decimal(key, default=Decimal('0.0'), choices=[])
  • get_decimal_list

# Get value by key or keypath trying to return it as list of Decimal values.
# If separator is specified and value is a string it will be splitted.
d.get_decimal_list(key, default=[], separator=',')
  • get_dict

# Get value by key or keypath trying to return it as dict.
# If value is a json string it will be automatically decoded.
d.get_dict(key, default={})
  • get_email

# Get email by key or keypath and return it.
# If value is blacklisted it will be automatically ignored.
# If check_blacklist is False, it will be not ignored even if blacklisted.
d.get_email(key, default='', choices=None, check_blacklist=True)
  • get_float

# Get value by key or keypath trying to return it as float.
# If choices and value is in choices return value otherwise default.
d.get_float(key, default=0.0, choices=[])
  • get_float_list

# Get value by key or keypath trying to return it as list of float values.
# If separator is specified and value is a string it will be splitted.
d.get_float_list(key, default=[], separator=',')
  • get_int

# Get value by key or keypath trying to return it as int.
# If choices and value is in choices return value otherwise default.
d.get_int(key, default=0, choices=[])
  • get_int_list

# Get value by key or keypath trying to return it as list of int values.
# If separator is specified and value is a string it will be splitted.
d.get_int_list(key, default=[], separator=',')
  • get_list

# Get value by key or keypath trying to return it as list.
# If separator is specified and value is a string it will be splitted.
d.get_list(key, default=[], separator=',')
  • get_list_item

# Get list by key or keypath and return value at the specified index.
# If separator is specified and list value is a string it will be splitted.
d.get_list_item(key, index=0, default=None, separator=',')
  • get_phonenumber

# Get phone number by key or keypath and return a dict with different formats (e164, international, national).
# If country code is specified (alpha 2 code), it will be used to parse phone number correctly.
d.get_phonenumber(key, country_code=None, default=None)
  • get_slug

# Get value by key or keypath trying to return it as slug.
# If choices and value is in choices return value otherwise default.
d.get_slug(key, default='', choices=[])
  • get_slug_list

# Get value by key or keypath trying to return it as list of slug values.
# If separator is specified and value is a string it will be splitted.
d.get_slug_list(key, default=[], separator=',')
  • get_str

# Get value by key or keypath trying to return it as string.
# Encoding issues will be automatically fixed.
# If choices and value is in choices return value otherwise default.
d.get_str(key, default='', choices=[])
  • get_str_list

# Get value by key or keypath trying to return it as list of str values.
# If separator is specified and value is a string it will be splitted.
d.get_str_list(key, default=[], separator=',')

Testing

# create python 3.8 virtual environment
virtualenv testing_benedict -p "python3.8" --no-site-packages

# activate virtualenv
cd testing_benedict && . bin/activate

# clone repo
git clone https://github.com/fabiocaccamo/python-benedict.git src && cd src

# install requirements
pip install --upgrade pip
pip install -r requirements.txt
pip install tox

# run tests using tox
tox

# or run tests using unittest
python -m unittest

# or run tests using setuptools
python setup.py test

License

Released under MIT License.