docci
docci is a package that provides document management utilities to simplify working with files in Python applications (mostly web applications).
Features
File abstraction via the FileAttachment class, which consists of a file name and bytes content and provides the following features:
- base64 string creation for transferring files in JSON APIs
- Content-Disposition header generation for passing the file name in web apps
- file name manipulation, such as extension extraction and MIME type detection
- saving the file to disk, useful when you have binary data from the web and need to explore it as a file on disk
File utilities built on FileAttachment:
- directory exploring: list directory files as a list of FileAttachment objects
- zip file exploring: list zip file contents as a list of FileAttachment objects
- zip file creation: create a zip archive from a list of FileAttachment objects
- openpyxl-based xlsx utilities, such as converting an xlsx workbook to a FileAttachment and creating an xlsx workbook from dicts
Usage
First, create a FileAttachment:
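# Creation from pdf (pdfkit renders an HTML file to PDF)
import pdfkit
from docci.file import FileAttachment
# output_path=False makes pdfkit return the PDF instead of writing it to disk
pdf_data: bytes = pdfkit.from_file("sample.html", output_path=False)
file = FileAttachment("sample.pdf", pdf_data)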
# Creation from xlsx
from openpyxl import load_workbook
from docci.file import FileAttachment
from docci.xlsx import xlsx_to_bytes
xlsx = load_workbook("sample.xlsx")
xlsx_data = xlsx_to_bytes(xlsx)
file = FileAttachment("sample.xlsx", xlsx_data)
# Creation from file on disk
from docci.file import FileAttachment
file = FileAttachment.load("path/to/file")
# Creation from base64 str
from docci.file import FileAttachment
file = FileAttachment.load_from_base64("base64-string", "filename")
Now you can use the FileAttachment features:
# To get base64 file representation
file.content_base64
# To generate Content-Disposition header with file name
file.content_disposition
# To get file extension
file.extension
# To get file mimetype
file.mimetype
# To save file to disk
file.save("path/to/file")
Specific file utilities are just functions:
# To get directory files (returns a tuple of the directory name and its files)
from docci.file import list_dir_files
dir_name, files = list_dir_files("path/to/dir")
# To list zip files
from docci.zip import list_zip_files
files = list_zip_files("path/to/zip")
# To create zip-archive (files first, then the archive name)
from docci.zip import zip_files
zip_file = zip_files([file], "sample.zip")
# To convert xlsx to FileAttachment
from openpyxl import load_workbook
from docci.xlsx import xlsx_to_file
xlsx_file = xlsx_to_file(load_workbook("path/to/xlsx"), "filename.xlsx")
# To create xlsx from dicts
from docci.xlsx import dicts_to_xlsx
xlsx = dicts_to_xlsx([
    {"col1": 1, "col2": 2},
    {"col1": 3, "col2": 4},
])
More features can be found in the API reference below.
API reference
docci.file
Utilities for file manipulation, such as extracting a file name from a path
class docci.file.FileAttachment(name: str, content: bytes)
Class for file abstraction
Parameters
name – file name. Restricted symbols (like */:) and the directory path (/opt/data/test.txt > test.txt) are removed from the file name.
content – binary file content
property content_base64
Convert content to base64 binary string
property content_disposition
Convert file name to a URL-encoded Content-Disposition header
>>> FileAttachment("sample.py", b"").content_disposition
{'Content-Disposition': 'attachment; filename=sample.py'}
>>> FileAttachment("98 - February 2019.zip", b"").content_disposition
{'Content-Disposition': 'attachment; filename=98%20-%20February%202019.zip'}
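The doctest output above is consistent with percent-encoding the file name. The following is a minimal sketch of that idea using only the standard library; the helper name `content_disposition_header` is hypothetical, and this is not necessarily how docci builds the header internally.

```python
from urllib.parse import quote


def content_disposition_header(file_name: str) -> dict:
    """Build an attachment header with a percent-encoded file name (illustrative helper)."""
    # quote() leaves letters, digits and "_.-~" untouched and encodes a space as %20.
    return {"Content-Disposition": f"attachment; filename={quote(file_name)}"}


plain = content_disposition_header("sample.py")
spaced = content_disposition_header("98 - February 2019.zip")
print(spaced["Content-Disposition"])
```

The returned dict can be merged into the response headers of whichever web framework is in use.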
property content_json
Return content as a dict with base64 content
property content_stream
Return file attachment content as a bytes stream
property extension
>>> FileAttachment("sample.py", b"").extension
'py'
classmethod load(path: str) → docci.file.FileAttachment
Load file from disk
classmethod load_from_base64(base64_str: Union[str, bytes], name: str) → docci.file.FileAttachment
Load file from a base64 string
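The base64 round trip behind content_base64 and load_from_base64 can be illustrated with the standard library alone. This is a sketch under assumptions: the helper names `to_base64` and `from_base64` are hypothetical, and docci's own return types may differ.

```python
import base64
from typing import Union


def to_base64(content: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string, suitable for a JSON field."""
    return base64.b64encode(content).decode("ascii")


def from_base64(base64_str: Union[str, bytes]) -> bytes:
    """Decode a base64 string; b64decode accepts both ASCII str and bytes."""
    return base64.b64decode(base64_str)


payload = b"%PDF-1.4 example bytes"
encoded = to_base64(payload)
restored = from_base64(encoded)
```

Encoding to a string is what makes binary content safe to embed in a JSON body, at the cost of roughly a one-third size increase.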
property mimetype
Guess mimetype by extension.
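Guessing a MIME type from the extension is something the standard `mimetypes` module already does. Here is a minimal sketch; the helper `guess_mimetype` and its `application/octet-stream` fallback are assumptions for illustration, not documented docci behavior.

```python
import mimetypes


def guess_mimetype(file_name: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, falling back to a generic type."""
    # guess_type returns (type, encoding); type is None when the extension is unknown.
    mime, _encoding = mimetypes.guess_type(file_name)
    return mime or default


pdf_type = guess_mimetype("sample.pdf")
unknown_type = guess_mimetype("noextension")
```

Because the guess relies only on the name, the content itself is never inspected; a mislabeled file gets the type of its extension.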
property name_without_extension
>>> FileAttachment("sample.py", b"").name_without_extension
'sample'
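The extension and name_without_extension doctests match what `os.path.splitext` produces once the leading dot is dropped. A small sketch, with the hypothetical helper `split_name`; how docci treats multi-part extensions such as `.tar.gz` is not stated in this reference, so the last line only shows what `splitext` itself does.

```python
import os.path
from typing import Tuple


def split_name(file_name: str) -> Tuple[str, str]:
    """Return (name without extension, extension without the leading dot)."""
    stem, ext = os.path.splitext(file_name)
    return stem, ext.lstrip(".")


simple = split_name("sample.py")
# splitext only splits on the last dot, so only "gz" counts as the extension here.
multi = split_name("archive.tar.gz")
```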
save(path: Optional[str] = None) → None
Save file to disk
docci.file.extract_file_name(path: str) → str
Extract file name from path; works for directories too
>>> extract_file_name("tests/test_api.py")
'test_api.py'
>>> extract_file_name("tests/test")
'test'
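For comparison, the same results can be obtained from the standard library. The sketch below uses a hypothetical helper `extract_name`; stripping a trailing slash via `normpath` is an assumption added here, since the doctests above do not cover that case.

```python
import os.path


def extract_name(path: str) -> str:
    """Return the last path component for both files and directories."""
    # normpath removes a trailing separator, so "tests/test/" still yields "test".
    return os.path.basename(os.path.normpath(path))


file_name = extract_name("tests/test_api.py")
dir_name = extract_name("tests/test")
trailing = extract_name("tests/test/")
```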
docci.file.list_dir_files(directory: str) → Tuple[str, Iterable[docci.file.FileAttachment]]
List directory files. Returns a Directory: a tuple of the directory name and the list of its files
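The (directory name, files) return shape can be sketched with plain tuples. In this illustration the hypothetical helper `read_dir` returns (name, bytes) pairs instead of FileAttachment objects, and skipping subdirectories and sorting by name are assumptions, not documented docci behavior.

```python
import os
import tempfile
from typing import List, Tuple


def read_dir(directory: str) -> Tuple[str, List[Tuple[str, bytes]]]:
    """Return the directory name plus (file name, content) pairs for its regular files."""
    dir_name = os.path.basename(os.path.normpath(directory))
    with os.scandir(directory) as entries:
        ordered = sorted(entries, key=lambda entry: entry.name)
    files = []
    for entry in ordered:
        if entry.is_file():  # assumption: subdirectories are skipped
            with open(entry.path, "rb") as handle:
                files.append((entry.name, handle.read()))
    return dir_name, files


# Demonstrate on a throwaway directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "reports")
    os.mkdir(target)
    with open(os.path.join(target, "a.txt"), "wb") as handle:
        handle.write(b"alpha")
    result = read_dir(target)
```

Returning the directory name alongside the files is what lets the result be passed on to a function that needs to recreate the folder, such as a zip builder.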
docci.file.normalize_name(raw_name: str, with_file_name_extract: bool = True) → str
Extract file name, remove restricted chars
>>> normalize_name('op/"oppa".txt')
'oppa.txt'
>>> normalize_name('op/"oppa".txt', with_file_name_extract=False)
'opoppa.txt'
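A rough sketch of this kind of normalization is shown below. The `RESTRICTED` character set is an assumption chosen to reproduce the two doctests above (it is the usual set of characters that are unsafe in file names); docci's actual list may differ, and `normalize` is a hypothetical stand-in.

```python
import os.path
import re

# Assumed set of restricted characters: \ / * ? : " < > |
RESTRICTED = re.compile(r'[\\/*?:"<>|]')


def normalize(raw_name: str, with_file_name_extract: bool = True) -> str:
    """Optionally drop the directory part, then strip restricted characters."""
    name = os.path.basename(raw_name) if with_file_name_extract else raw_name
    return RESTRICTED.sub("", name)


extracted = normalize('op/"oppa".txt')
kept_path = normalize('op/"oppa".txt', with_file_name_extract=False)
```

The order matters: extracting the file name first discards the directory, while skipping that step merges the directory text into the name once the slash is removed.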
docci.xlsx
Utilities for working with openpyxl.Workbook
docci.xlsx.dicts_to_xlsx(dicts: Sequence[Dict], headers: Sequence[str] = None) → openpyxl.workbook.workbook.Workbook
Create an openpyxl.Workbook whose rows hold the values of dicts.
Parameters
dicts – list of dicts to insert
headers – list of headers; if None, the dict keys are used
Returns
openpyxl.Workbook
docci.xlsx.xlsx_from_bytes(bytes_: bytes) → openpyxl.workbook.workbook.Workbook
Create xlsx from bytes.
docci.xlsx.xlsx_from_file(file: docci.file.FileAttachment) → openpyxl.workbook.workbook.Workbook
Create xlsx from FileAttachment
docci.xlsx.xlsx_to_bytes(xlsx: openpyxl.workbook.workbook.Workbook) → bytes
Convert openpyxl.Workbook to bytes
docci.xlsx.xlsx_to_file(xlsx: openpyxl.workbook.workbook.Workbook, name: str) → docci.file.FileAttachment
Convert openpyxl.Workbook to FileAttachment
docci.zip
Utilities for working with zip archives
docci.zip.list_zip_files(raw_zip_file: Union[str, bytes, _io.BytesIO, zipfile.ZipFile, docci.file.FileAttachment]) → Sequence[docci.file.FileAttachment]
List zip archive files
docci.zip.raw_to_zip(raw_zip_file: Union[str, bytes, _io.BytesIO, zipfile.ZipFile, docci.file.FileAttachment]) → zipfile.ZipFile
Convert path, bytes, stream, FileAttachment to ZipFile.
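Accepting several input types and normalizing them to a single ZipFile can be sketched with the standard library. The helpers `open_zip` and `list_members` below are hypothetical, handle only the str, bytes, BytesIO and ZipFile cases (not FileAttachment), and return (name, bytes) pairs; skipping directory entries is an assumption.

```python
import io
import zipfile
from typing import List, Tuple, Union

RawZip = Union[str, bytes, io.BytesIO, zipfile.ZipFile]


def open_zip(raw: RawZip) -> zipfile.ZipFile:
    """Normalize a path, raw bytes, a stream or an open archive to a ZipFile."""
    if isinstance(raw, zipfile.ZipFile):
        return raw
    if isinstance(raw, bytes):
        return zipfile.ZipFile(io.BytesIO(raw))
    # ZipFile accepts both a str path and a seekable file-like object directly.
    return zipfile.ZipFile(raw)


def list_members(raw: RawZip) -> List[Tuple[str, bytes]]:
    """Return (member name, content) pairs, skipping directory entries."""
    archive = open_zip(raw)
    return [
        (info.filename, archive.read(info))
        for info in archive.infolist()
        if not info.is_dir()
    ]


# Build a small in-memory archive to list.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", b"hello")
    archive.writestr("docs/readme.md", b"# readme")

members = list_members(buffer.getvalue())
```

Funnelling every input through one conversion function keeps the listing code free of type checks.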
docci.zip.zip_dirs(dirs: Iterable[Tuple[str, Iterable[FileAttachment]]], zip_name: str) → docci.file.FileAttachment
Zip folders into a single zip archive named zip_name
docci.zip.zip_files(files: Iterable[docci.file.FileAttachment], zip_name: str) → docci.file.FileAttachment
Zip files into an archive named zip_name
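Building an archive in memory from named byte contents, either flat or grouped by folder, might look like the sketch below. The helpers `zip_named_files` and `zip_named_dirs` are hypothetical; they use (name, bytes) pairs in place of FileAttachment objects and return raw archive bytes, and the DEFLATE compression choice is an assumption.

```python
import io
import zipfile
from typing import Iterable, Tuple

NamedFile = Tuple[str, bytes]


def zip_named_files(files: Iterable[NamedFile]) -> bytes:
    """Write each (name, content) pair to the root of a new in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files:
            archive.writestr(name, content)
    return buffer.getvalue()


def zip_named_dirs(dirs: Iterable[Tuple[str, Iterable[NamedFile]]]) -> bytes:
    """Write each directory's files under a folder named after the directory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for dir_name, files in dirs:
            for name, content in files:
                # Zip member names always use forward slashes as separators.
                archive.writestr(f"{dir_name}/{name}", content)
    return buffer.getvalue()


flat = zip_named_files([("a.txt", b"alpha"), ("b.txt", b"beta")])
nested = zip_named_dirs([("reports", [("a.txt", b"alpha")]), ("logs", [("b.txt", b"beta")])])
flat_names = zipfile.ZipFile(io.BytesIO(flat)).namelist()
nested_names = zipfile.ZipFile(io.BytesIO(nested)).namelist()
```

The directory variant takes the same (directory name, files) tuples that a directory listing produces, so the two steps compose without conversion.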
Development & contribution
Publishing to PyPI
Bump the version (choose one of major, minor, or patch):
poetry version <major|minor|patch>
Build and publish package:
poetry publish --build
Published package can be found here: https://pypi.org/project/docci/