Category: Blog

  • nedextract


    Nedextract

    nedextract is being developed to extract specific information from annual report PDF files that are written in Dutch. Currently it tries to do the following:

    • Read the PDF file, and perform Named Entity Recognition (NER) using Stanza to extract all persons and all organisations named in the document, which are then processed by the processes listed below.

    • Extract persons: using a rule-based method that searches for specific keywords, this module tries to identify:

      • Ambassadors

      • People in important positions in the organisation. The code tries to determine a main job description (e.g. director or board) and a sub-job description (e.g. chairman or treasurer). Note that these positions are identified and outputted in Dutch.
        The main jobs that are considered are:

        • directeur
        • raad van toezicht
        • bestuur
        • ledenraad
        • kascommissie
        • controlecommisie.

        The sub positions that are considered are:

        • directeur
        • voorzitter
        • vicevoorzitter
        • lid
        • penningmeester
        • commissaris
        • adviseur

      For each person that is identified, the code searches for keywords in the sentences in which the name appears, or in the sentence directly before or after, to determine the main position. Sub-positions are determined based on words appearing directly before or after the name of a person for whom a main job has been determined. For the main jobs and sub-positions, various ways of writing are covered by the keywords. Before the job identification starts, name deduplication is performed by creating lists of names that (likely) refer to one and the same person (e.g. Jane Doe and J. Doe).

    • Extract related organisations:

      • After Stanza NER collects all candidates for mentioned organisations, postprocessing tries to determine which of these candidates are most likely true organisations. This is done by considering: how often the term is mentioned in the document, how often the term was identified as an organisation by Stanza NER, whether the term contains keywords that make it likely to be a true positive, and whether the term contains keywords that make it likely to be a false positive. For candidates that are mentioned only once in the text, it is also considered whether the term by itself (i.e. without context) is identified as an organisation by Stanza NER. Additionally, for candidates that are mentioned only once, an extra check is performed to determine whether part of the candidate organisation is found in the list of organisations already identified as true, and whether that true organisation is common within the text. In that case the candidate is considered ‘already part of another true org’ and is not added to the true organisations. This is done because NER sometimes identifies an additional random word as being part of an organisation’s name.
      • For those terms that are identified as true organisations, the number of occurrences in the document of each of them (in its entirety, enclosed by word boundaries) is determined.
      • Finally, the identified organisations are matched against a list of provided organisations using the anbis argument, to collect their rsin numbers for further analysis. An empty file, ./Data/Anbis_clean.csv, is available as a template for such a file. Matching is attempted on both currentStatutoryName and shortBusinessName. Only full matches (independent of capitals), and full matches with the additional term ‘Stichting’ at the start of the identified organisation (again independent of capitals), are considered. Fuzzy matching is not used here, because during testing it was found to produce a significant number of false positives. A minimal sketch of this matching rule is shown after this feature list.
    • Classify the sector in which the organisation is active. The code uses a pre-trained model to identify one of eight sectors in which the organisation is active. The model is trained on the 2020 annual report PDF files of CBF-certified organisations.
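
    As an illustration of the anbis matching rule described above, here is a minimal sketch. It is not the package's actual implementation; it only assumes the column names from the template file ./Data/Anbis_clean.csv:

    import pandas as pd

    def match_anbis(identified_orgs, anbis_csv="./Data/Anbis_clean.csv"):
        """Full, case-insensitive matches on currentStatutoryName or shortBusinessName,
        optionally with 'Stichting' prepended to the identified organisation."""
        anbis = pd.read_csv(anbis_csv)
        statutory = anbis["currentStatutoryName"].str.lower()
        business = anbis["shortBusinessName"].str.lower()
        matches = {}
        for org in identified_orgs:
            candidates = {org.lower(), "stichting " + org.lower()}
            hit = anbis[statutory.isin(candidates) | business.isin(candidates)]
            matches[org] = hit["rsin"].iloc[0] if not hit.empty else None
        return matches

    print(match_anbis(["Voorbeeldfonds"]))  # hypothetical organisation name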

    Prerequisites

    1. Python 3.8, 3.9, 3.10, 3.11
    2. Poppler; poppler is a prerequisite for installing pdftotext. Instructions can be found here: https://pypi.org/project/pdftotext/. Please note that to install poppler on a Windows machine using conda-forge, the Microsoft Visual C++ build tools have to be installed first.

    Installation

    nedextract can be installed using pip:

    pip install nedextract

    The required packages that are installed are: FuzzyWuzzy, NumPy, openpyxl, poppler, pandas, pdftotext, python-Levenshtein, scikit-learn, Stanza, and xlsxwriter. [1]

    Usage

    Input

    The full pipeline can be executed from the command line using python3 -m nedextract.run_nedextract, followed by one or more of the following arguments:

    • Input data, one or more pdf files, using one of the following arguments:
      • -f file: path to a single pdf file
      • -d directory: path to a directory containing pdf files
      • -u url: link to a pdf file
      • -uf urlf: text file containing one or multiple urls to pdf files. The text file should contain one url per line, without headers and footers.
    • -t tasks (optional): can be ‘people’, ‘orgs’, ‘sectors’ or ‘all’. Indicates which tasks are to be performed. Defaults to ‘people’.
    • -a anbis (optional): path to a .csv file which will be used with the orgs task. The file should contain (at least) the columns rsin, currentStatutoryName, and shortBusinessName. An empty example file, which is also the default file, can be found in the folder ‘Data’. The data in the file will be used to match identified organisations against, in order to collect the rsin numbers provided in the file.
    • model (-m), labels (-l), vectors (-v) (optional): each referring to a path containing a pretrained classifier model, label encoding, and tf-idf vectors respectively. These will be used for the sector classification task. A model can be trained using the classify_organisation.train function.
    • -wo write_output: TRUE/FALSE, defaults to TRUE; sets whether to write the output data to an Excel file.

    For example: python3 -m nedextract.run_nedextract -f pathtomypdf.pdf -t all -a anbis.csv

    Returns:

    Three dataframes, one for the ‘people’ task, one for the ‘sectors’ task, and one for the ‘orgs’ task. If write_output=True, the gathered information is written to auto-named xlsx files in the folder Output. The outputs of the different tasks are written to separate xlsx files with the following naming convention:

    • ‘./Output/outputYYYYMMDD_HHMMSS_people.xlsx’
    • ‘./Output/outputYYYYMMDD_HHMMSS_related_organisations.xlsx’
    • ‘./Output/outputYYYYMMDD_HHMMSS_general.xlsx’

    Here YYYYMMDD and HHMMSS refer to the date and time at which the execution started.

    Tutorials

    Tutorials on the full pipeline and (individual) useful analysis tools can be found in the Tutorials folder.

    Contributing

    If you want to contribute to the development of nedextract, have a look at the contribution guidelines.

    How to cite us


    If you use this package for your scientific work, please consider citing it as:
    Ootes, L.S. (2023). nedextract ([VERSION YOU USED]). Zenodo. https://doi.org/10.5281/zenodo.8286578
    See also the Zenodo page for exporting the citation to BibTeX and other formats.

    Credits

    This package was created with Cookiecutter and the NLeSC/python-template.

    Footnotes

    1. If you encounter problems with the installation, these often arise from the installation of poppler, which is a requirement for pdftotext. Help can generally be found on the pdftotext project page.

  • forgefed

    ForgeFed


    ForgeFed is an ActivityPub-based federation protocol for software forges. You can read more about ForgeFed and the protocol specification on our website.

    Contributing

    There’s a huge variety of tasks to do! Come talk with us on the forum or chat. More eyes going over the spec are always welcome! And feel free to open an issue if you notice missing details or unclear text or have improvement suggestions or requests.

    However, to maintain a manageable working environment, we do reserve the issue tracker for practical, actionable work items. If you want to talk first to achieve more clarity, we prefer that you write to us on the forum or chat; opening an issue can come later.

    If you wish to join the work on the ForgeFed specification, here are some technical but important details:

    • We don’t push commits to the main branch, we always open a pull request
    • Pull requests making changes to the specification content must have at least 2 reviews; they then wait for a cooldown period of 2 weeks during which more people can provide feedback, raise challenges and conflicts, improve the proposed changes, etc.
    • If you wish to continuously participate in shaping the specification, it would be useful to go over the open PRs once a week or so, to make sure you have a chance to communicate your needs, ideas and thoughts before changes get merged into the spec

    Important files in this repo to know about:

    • The file resources.md lists which team members have access to which project resources; openness and transparency are important to us!
    • The actual specification source texts are in the spec/ directory
    • JSON-LD context files are in the rdf/ directory

    Repo mirrors

    Website build instructions

    The ForgeFed website is generated via a script using the Markdown files in this repository. See ./build.sh for more details.

    License

    All contents of this repository are freely available under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.

    The ForgeFed logo was created by iko.

    Historical resources

    ForgeFed started its life on a mailing list. The old ForgeFed forum at talk.feneas.org can be viewed via the Internet Archive’s Wayback Machine.

    Funding

    This project is funded through the NGI Zero Entrust Fund, a fund established by NLnet with financial support from the European Commission’s Next Generation Internet program. Learn more at the NLnet project page.


  • Nextjs-Dashboard

    Nextjs-Dashboard

    Handicraft Dashboard

    Nextjs 15 (rc) – TypeScript – Tailwind – PostgreSQL


    Introduction

    Although this application may look complete, I focused on the administrator dashboard features, in order to create something both useful and original. I also used NextAuth v5 to see how it’s possible to log in as both user and administrator. In the real world, you’d want to use Lucia instead.

    I’m interested in:

    • capturing the public IP and then using it for geolocation.
    • how to retrieve users’ browser and OS data and display them in graphs.

    Display:

    • product stocks in order from smallest to largest.
    • connected users.
    • messages, connections and sales by day, month and year.
    • tasks set by administrators on all pages of the application.

    Goals

    Log in as User or Admin with NextAuth V5, without external APIs (GitHub & Google)

    • The administrator can access the dashboard.
    • Users & Admins can access products & payment.

    User:

    • main page
    • products
    • contact (can send a message to the admin)

    Admin:

    • main page
    • dashboard

    Dashboard with multiple management system:

    • message
    • statistics
    • users
    • products (best sellers & stock)
    • bilan

    Fetch the public IP from user

    Retrieve the public IP & determine the location by latitude & longitude with a react-leaflet map.

    https://jsonip.com/

    Fetch the latitude & longitude using SECRET_API_KEY & publicIp to build the URL, such as:

    https://api.ip2location.io/?key=${SECRET_API_KEY}&ip=${publicIp}

    (You can use the API free of charge at https://www.ip2location.io/).

    I repeatedly ran into a window is undefined error. To solve this problem in my RSC (React Server Component), I simply added:

    export const dynamic = "force-dynamic";

    Data are displayed under the Network link in the Dashboard.


    Retrieve Browser & OS from users

    Display them to the user & write them to a file

    Useful link: window.navigator.userAgent

    Data are written by:

    • /app/api/profile/browseros/route.ts

    Data are saved into:

    • /utils/browseros-data.json
    • /utils/ip-data.json

    Data are displayed in charts

    • components/menu-items/graphs/BarChartBrowser.tsx
    • components/menu-items/graphs/BarChartOs.tsx

    Manage products from store as ADMIN with server-actions & postgresql (prisma)

    dashboard (admin)

    1. Update/Modify
    2. Delete
    3. Create
    • /components/menu-items/admin-products/ModifyProduct.tsx
    • /components/menu-items/admin-products/CreateProduct.tsx

    (They have the same route)


    Messages

    Users can send messages to the Admin; the Admin has a message management system

    contact (user)

    • Write & send a message to admin.

    dashboard (admin)

    • Open/close messages
    • Read & write messages to respond to users.

    Data

    Retrieve data from db & data.json to display values in charts

    dashboard (admin)

    • Messages
    • Network
    • Statistics (number of connections to the site per day, OS, browser, satisfaction)
    • Store of products (create – delete – update)
    • Bilan

    Configuration in .env

    POSTGRES_HOST=127.0.0.1
    POSTGRES_PORT=PPPPPPPPP
    POSTGRES_USER=UUUU
    POSTGRES_PASSWORD=XXXX
    POSTGRES_DB=DBDBDBD
    
    DATABASE_URL="postgresql://UUUU:XXXX@localhost:PPPPPPPPP/DBDBDBD?schema=public"
    
    # use: "openssl rand -base64 32"
    AUTH_SECRET="result of cmd above"
    NEXTAUTH_URL=http://localhost:3000
    
    # build mode require this setting:
    AUTH_TRUST_HOST=true
    

    Don’t forget to configure .gitignore to avoid sharing sensitive data.

    Add .env to .gitignore & save the file.


    Authentication with next-auth@beta

    I wanted to build a login system without an external API like Google or GitHub, to grant different access as user or admin.

    All files that include NextAuth V5:

    • app/api/auth/[...nextauth]/route.ts
    • /app/auth/…
    • middleware.ts
    • prisma/prisma.ts

    Security

    Use next-safe-action with zod & zod-form-data to secure server action requests (avoiding the exposure of sensitive data). It interacts with the middleware.

    • /lib/actions.ts
    • /lib/safe-action.ts

    Extra

    I created a shop as an e-commerce example to combine zustand with prisma requests, just to understand how prisma tables work (in this context) & how to initialize products in the zustand store. I didn’t use stripe, because that wasn’t my goal.


    Installation

    $ pnpm add sharp

    $ pnpm add react-icons

    $ pnpm add tailwindcss-animate

    $ pnpm add chart.js react-chartjs-2

    $ pnpm add leaflet

    $ pnpm add react-leaflet

    (@types/react-leaflet is not required; it is deprecated)

    $ pnpm add zustand

    $ pnpm add @tanstack/react-query

    $ pnpm add @tanstack/react-query-devtools

    $ pnpm add jsonwebtoken

    $ pnpm add @types/jsonwebtoken

    $ pnpm add react-hook-form

    $ pnpm add zod @hookform/resolvers

    $ pnpm add zod-form-data

    $ pnpm add @hookform/error-message

    $ pnpm add next-auth@beta @auth/prisma-adapter

    $ pnpm add @prisma/client

    $ pnpm add -D prisma

    $ pnpm prisma init --datasource-provider postgresql

    (create db & table with PostgreSQL)

    $ pnpm prisma migrate dev --name init

    (pnpm prisma db push (schema))

    (pnpm prisma db seed (seed.ts))

    $ pnpm add bcryptjs

    $ pnpm add -D @types/bcryptjs

    $ pnpm add react-hot-toast

    $ pnpm add next-safe-action

    Video Youtube


    Ref

    • NextAuth V5:

    auth.ts

    • If you run into trouble with the prisma migration schema, follow this link:

    prisma-migrate


    Enjoy it! 🐨

  • elastic-alexa

    Elasticsearch Alexa skill

    Skill for Amazon Echo to enable Alexa to talk to Elasticsearch.

    Current possible interaction

    Configured IntentSchema:

    ElasticCount Count {emptyTerm|term}
    

    Explanation:

    1. Search for the term in Elasticsearch and count the result set

    Example:

    Alexa? Ask Elastic to count error
    

    is transformed to skill (intent) and variable configuration (slots):

    intent=ElasticSearch
    slot(term)=error
    

    Note: Data type number can be translated from five to 5 directly.

    Java application called by alexa

    Amazon provides a nice SDK and a nice way to interact with Alexa. After registering your skill in the Amazon developer console, your endpoint gets called with the relevant payload. I decided to use a Spring Boot application to handle these requests. Java code is in src; the relevant business logic is included in

    src/main/java/info/unterstein/alexa/elastic/alexa/ElasticSpeechlet.java
    

    Get this app up and running

    Currently you need to configure the target Elasticsearch cluster within the code. This should be changed so that it can be configured while installing this skill on an Amazon Echo, see the section Open issues. But, for now, you need to go to

    src/main/java/info/unterstein/alexa/elastic/ElasticSpeechlet.java
    

    and do something like:

      // TODO: make the cluster configurable instead of hard-coding it here
      public ElasticSpeechlet() {
        client = new ElasticSearchClient("your.elastic.url", 9300, "your.cluster.name");
      }
    

    Then you need to package this app and start it somewhere:

    mvn clean package
    # deploy it somewhere with following command
    java -jar elastic-alexa-0.0.1-SNAPSHOT.jar --server.port=19002
    

    Walkthrough amazon developer console

    Step 1: Skill information


    Step 2: Interaction model


    Text entered:

    speechAssets/IntentSchema.json
    speechAssets/SampleUtterances.txt
    

    Step 3: Configuration


    I needed an HTTP endpoint with a valid SSL certificate. You can choose between an on-prem installation or AWS Lambda. I decided to deploy the app directly to my server, proxied behind NGINX using the following configuration:

    server {
            listen 443 ssl;
            server_name unterstein.info;
    
    ...
    
            ssl_certificate      /etc/nginx/ssl/unterstein.info.crt;
            ssl_certificate_key  /etc/nginx/ssl/unterstein.info.key;
    
    ...
    
            location /alexa {
                    proxy_pass http://127.0.0.1:19002/alexa;
                    proxy_set_header Host $host;
                    proxy_set_header X-Real-IP $remote_addr;
                    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            }
    }
    
    

    Step 4: SSL Certificate


    Step 5: Test


    At this point it is possible to enable this skill for all Amazon Echos registered to the current Amazon account, and it can be used directly.

    Short demo video

    https://twitter.com/unterstein/status/832302202702196736

    Useful reads

    Open issues

  • OTP-Raspberry-Pi

    Random key generator

    It was designed to monitor environmental data through a Raspberry Pi and introduce sufficient entropy into the program.

    What runs is an infinite loop that increments a variable within a fixed range and reads the state of that variable
    when stochastic environmental conditions occur, such as small variations in atmospheric pressure.

    This produces a random sequence, which can be demonstrated through the frequency histogram: only uniform distributions are observed over the generated keys.
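
    A minimal sketch of this sampling idea (not the actual script; the sensor read below is a stand-in for the real Raspberry Pi measurement):

    import random

    def read_pressure():
        # stand-in for a real barometer read on the Raspberry Pi (e.g. a BMP280 over I2C)
        return 1013.25 + random.gauss(0, 0.02)

    def generate_key(length=32, counter_range=256, threshold=0.01):
        """Free-running counter sampled whenever the pressure moves by more than the threshold."""
        key, counter, last = [], 0, read_pressure()
        while len(key) < length:
            counter = (counter + 1) % counter_range   # increment within a fixed range
            current = read_pressure()
            if abs(current - last) > threshold:       # stochastic environmental event
                key.append(counter)                   # read the counter state at that moment
                last = current
        return bytes(key)

    print(generate_key(8).hex())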

    One-Time-Pad

    This is an attempt to implement an OTP-style stream cipher by fixing the cryptographic flaws of the well-known RC4, trying to combine the security of OTP ciphers with the practicality of more modern ciphers such as AES.

    An OTP cipher is a perfect cipher because it is mathematically secure.

    The basic idea is to take the Vigenère cipher, insecure on its own, and impose particular conditions on the keys to create a new, OTP-style cipher called Vernam.

    The conditions are:

    1. an encryption key as long as the plaintext
    2. randomness of the key
    3. every plaintext to be encrypted must use a different key (One Time Pad)

    A cipher defined this way resists even brute-forcing of the keys with infinite computing power, because it implements the concept of deniable encryption: cryptanalysis would extract every possible meaningful plaintext, and one could not tell which message was actually exchanged.

    sincVernam.py

    In sincVernam.py the Vigenère table is not used for encoding; instead the XOR operator is used, extending the alphabet to all chars.

    Since the Vernam cipher is hard to implement because of its burdensome key management, some compromises are adopted.

    The encryption process is as follows:

    1. ask for a password
    2. generate the cryptographic hash of the password to use as a seed
    3. initialize a cryptographically secure pseudorandom number generator using that hash as the seed
    4. generate a pseudorandom but cryptographically secure sequence of numbers to use as the encryption key
    5. XOR the plaintext with the encryption key
    6. increment a counter that is appended to the initial password, so that a different encryption key is generated each time
    7. repeat the previous steps for each new plaintext message
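
    A minimal sketch of these steps (not the actual sincVernam.py code; a SHA-512 hash chain stands in for whichever cryptographically secure generator the script uses):

    import hashlib

    def keystream(seed: bytes, length: int) -> bytes:
        """Deterministic keystream derived from the seed by hash chaining (stand-in for a seeded CSPRNG)."""
        out, block = b"", 0
        while len(out) < length:
            out += hashlib.sha512(seed + block.to_bytes(8, "big")).digest()
            block += 1
        return out[:length]

    def encrypt(message: bytes, password: str, counter: int) -> bytes:
        # steps 1-3: password + counter hashed into a seed
        seed = hashlib.sha512((password + str(counter)).encode()).digest()
        # step 4: key as long as the message
        key = keystream(seed, len(message))
        # step 5: XOR message and key (the same call decrypts)
        return bytes(m ^ k for m, k in zip(message, key))

    # steps 6-7: a new counter value for every message yields a new key
    c1 = encrypt(b"first message", "hunter2", counter=0)
    c2 = encrypt(b"second message", "hunter2", counter=1)
    assert encrypt(c1, "hunter2", counter=0) == b"first message"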

    All the conditions of the Vernam cipher are thus addressed:

    1. the first, by implementing a generator that guarantees the key length with minimal effort
    2. true randomness is absent, so that the encryption keys can be derived from a password; the use of a cryptographically secure pseudorandom generator brings this close to OTP security
    3. for every new plaintext message a new encryption key is derived, impossible to recover without knowing the initial password

    In addition, an integrity hash of the message is computed and appended to the ciphertext.

    The counter could also be appended to the end of the ciphertext to guarantee synchrony and to allow correcting synchronization errors if some message is lost. The value of this counter can be public, playing the logical role of a cryptographic salt for the derivation of new passwords.

    A random processing delay is also set inside the various functions to mitigate timing attacks, and a timestamp with the date and time of encryption is added to the plaintext message to mitigate replay attacks.

    Cryptanalysis

    1. The only known cryptanalysis targets the password, the point where the cipher is most vulnerable, for example to brute-force attacks; these are considered mitigable through the strength of the chosen password, as with other ciphers considered secure such as AES. It is recommended to use the cipher within adequate standards and protocols for password management.

    2. Deniable encryption is obtained by turning the stream cipher into a block cipher whose blocks contain as many plaintext chars as the digest of the cryptographic hashing algorithm used (for example sha512), and by deriving new hashes for every block of plaintext.

    OTP.py

    The program builds on the fixed version of sincVernam.py and adds the possibility of using, besides a password, a file of random data as the encryption key, bypassing the derivation of keys generated with a pseudorandom generator.

    Pseudorandom key generation takes over once the file used as the encryption key is exhausted, because every byte of message encrypted or decrypted corresponds to removing one byte from the key file by truncating it, which avoids reuse and guarantees OTP security.
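
    A minimal sketch of this key-file consumption (not the actual OTP.py code; the file name and the exact truncation strategy are assumptions):

    def consume_key_bytes(key_path: str, n: int) -> bytes:
        """Read n key bytes from the front of the key file, then truncate them away so they are never reused."""
        with open(key_path, "rb") as f:
            data = f.read()
        used, remaining = data[:n], data[n:]
        with open(key_path, "wb") as f:
            f.write(remaining)
        return used  # may be shorter than n if the file is nearly exhausted

    plaintext = b"attack at dawn"
    key = consume_key_bytes("pad.key", len(plaintext))  # hypothetical key file from the Raspberry Pi generator
    if len(key) < len(plaintext):
        raise RuntimeError("key file exhausted: fall back to the password-derived keystream")
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))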

    Secure use requires developing a protocol and a framework within which the management of the random keys and the synchronization of the communications takes place.


  • integrationapp-hubspot-example

    Nextjs + Integration app + Hubspot Example

    This example demonstrates how to use Integration.app with Hubspot. To test the integration, you need a Hubspot account. If you don’t have one, you can create it here.


    Prerequisites

    Clerk (Authentication)

    We use Clerk for authentication. You need to create an account in Clerk. Once you have an account, you need to create a new application. You will need your public and secret keys (NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY and CLERK_SECRET_KEY) to get authentication working in the app.

    Integration app (Data Integration)

    Integration.app lets you integrate data into your app from different data sources. You can sign up for a free account here.

    Once your account is created, you will need to add the Hubspot app and create two actions:

    • list-contacts (to list all the contacts)
    • create-contact (to create a new contact)

    If your actions have different names, you’ll need to update your .env.local file with the correct action names.

    You can also import this template into your workspace: Import template (this doesn’t actually work yet; it’s only a proposal).

    Installation

    1. Clone the repo

      git clone
    2. Install NPM packages

      npm install
    3. Enter your Integration.app and Clerk keys in .env.local. Copy the .env.local.example file and rename it to .env.local to get started:

    cp .env.local.example .env.local
    4. Run the app
      npm run dev
    5. Open http://localhost:3000 to view it in the browser.

    Suggested Improvements to integration.app

    • Enhance TypeScript support by developing a CLI tool that generates types based on the actions schema.
    • Provide extensive usage examples to help users understand various use cases.
    • Establish a Discord community to offer support and foster engagement with customers.
    • Improve the existing documentation to make it more comprehensive.
    • Ensure the authenticated workspace ID is passed back to the URI, as the current Hubspot URI is missing the workspace ID.
    • Fix the UI issue that prevents an output schema from being created, as the interface is currently unstable.
    • Implement a feature to clone actions, which would be beneficial for testing and experimenting with different examples.

    Todo

    • Use integration.app pagination in the UI
    • Make UI responsive


  • SmartI18N

    SmartI18N

    SmartI18N aims to simplify the internationalization of web projects and speed up their implementation. There is an online editor with which texts can be adjusted and changed in real time. Various SDKs are available for integration, which make working with SmartI18N easier.

    You can find more information about SmartI18N at www.smarti18n.com

    🚨 This project is now officially retired! 🧓🪦

    Development has ended, support has vanished, and the code is now living its best life in a quiet repo somewhere, sipping digital margaritas. 🍹

    Feel free to fork it, remix it, or just stare at it nostalgically — but don’t expect it to do any new tricks. It’s not dead… it’s just resting. 😴

    First Steps

    To get started, we recommend using the SmartI18N server at www.smarti18n.com. Later you can host SmartI18N yourself using a Docker image. More on that below.

    Create an account using the editor and then create a project. Using the project key and secret you can then integrate SmartI18N into your project.

    SDKs

    Interfaces currently exist for Spring Message Sources and AngularJS. In the future we want to add further interfaces.

    Spring Framework SDK

    Example to follow

    AngularJS SDK

    Example to follow

    Docker Image

    SmartI18N can be self-hosted. More information can be found on Docker Hub.

    MongoDB (optional)

    Running SmartI18N requires a MongoDB. You can use an externally hosted instance or a Docker container.

    docker run -d --name smarti18n-mongo mongo
    

    smarti18n-messages

    docker run -d --name smarti18n-messages --link smarti18n-mongo:mongo -p 30001:8080 -e MONGODB_URL=mongodb://mongo/smarti18n-messages  smarti18n/messages
    

    smarti18n-editor

    docker run -d --name smarti18n-editor -p 30002:8080 -e "SMARTI18N_MESSAGES_HOST=http://localhost:30001" smarti18n/editor
    

    First Login

    Now you can look up the initial admin password in the logs of the smarti18n-messages container.

    docker logs smarti18n-messages
    

    #######################################################################
    Initializing Application
    Opened connection \[connectionId{localValue:2, serverValue:2}\] to mongo:27017
    Create Default User \[default@smarti18n.com\] with Password \[PASSWORD\]
    create default project \[default\] with secret \[SECRET\]
    Initializing Application finished
    #######################################################################
    

    With that password and the e-mail default@smarti18n.com you can log in to the editor at http://localhost:30002.

    License

    SmartI18n is released under version 2.0 of the Apache License.


  • starter-architect

    Please Don’t Use

    Please use npm init remix instead of this starter repo to create a new Remix app.
    This repository was archived on April 29, 2021.

    Remix Starter for Architect (AWS CloudFormation)

    Welcome to Remix!

    This is a starter repo for using Remix with Architect (wrapper around AWS CloudFormation).

    Development

    When developing your app, you’ll need two terminal tabs, one to run Architect’s sandbox, and the other to run the Remix development server. In production, however, the Remix development server won’t be used because your assets will be built and shipped with the server.

    First, rename .npmrc.example to .npmrc and insert the license key you get from logging in to your dashboard at remix.run.

    Note: if this is a public repo, you’ll probably want to move the line with
    your key into ~/.npmrc to keep it private.

    Next, install all dependencies using npm:

    npm install

    Your @remix-run/* dependencies will come from the Remix package registry.

    Remix Development Server

    Once everything is installed, start the Remix asset server with the following command:

    npm run dev

    The dev server automatically rebuilds as your source files change.

    Architect Sandbox

    Architect recommends installing their CLI and the AWS sdk globally:

    $ npm i -g @architect/architect aws-sdk

    Now start the sandbox:

    $ arc sandbox

    You should now be able to visit http://localhost:3333.

    Deploying

    First, you’ll need to have the AWS CLI installed, here are the instructions. Then follow the Architect setup instructions: https://arc.codes/docs/en/guides/get-started/detailed-aws-setup.

    Now you’re ready to deploy. From the Remix http handler directory, build the app for production:

    $ npm run build

    And then from the root of the project, deploy with arc.

    $ arc deploy

    That’s it!

    Documentation

    Detailed documentation for Remix is available at remix.run.


  • SalahKart

    Internship Experience

    Software Engineer Intern

    SalahKart- A Product of Jaspy Technologies Pvt. Ltd.

    CIN: U78100UP2024PTC199951

    Duration: 2 months

    Roles and Responsibilities:

    • NLP and Web Scraping:
      • Employed BeautifulSoup and Selenium WebDriver to scrape data from LinkedIn profiles.
      • Conducted resume parsing, extracting skills from resumes and job descriptions to calculate the matching percentage.
    • Skill Matching and Semantic Textual Similarity (STS):
      • Implemented three primary methods for skill matching:
        • SentenceTransformer Model
        • CrossEncoder Model
        • Dictionary Method using a predefined skill list from a CSV file.
      • Combined the methods using a normalized formula to compute the final skill matching percentage (a simplified sketch appears after this list).
      • Conducted Semantic Textual Similarity (STS) for accurate skill assessment.
    • Natural Language Processing (NLP):
      • Utilized SpaCy and NLTK for advanced NLP tasks.
      • Developed a Named Entity Recognition (NER) pipeline using Transformers by Hugging Face.
      • Extracted nouns and verbs from resumes on a section-wise basis for enhanced resume analysis.
    • Development and Testing:
      • Developed and tested code in Python using NumPy, Pandas, and regular expressions.
      • Used VS Code and Google Colab for coding and collaboration.
      • Designed and documented workflow using Lucid Chart for flow diagrams and brainstorming sessions.
      • Created a backend database with a Level 2 ER Diagram using Eraser.io.
      • Performed unit and integration testing manually to ensure code quality and reliability.
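
    A simplified, self-contained sketch of how the three skill-matching methods can be combined. The model names and the equal weighting below are assumptions for illustration, not the exact models or formula used during the internship:

    from sentence_transformers import SentenceTransformer, CrossEncoder, util

    bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")
    cross_encoder = CrossEncoder("cross-encoder/stsb-roberta-base")

    def skill_match(resume_skills, jd_skills, skill_dictionary):
        resume_text, jd_text = ", ".join(resume_skills), ", ".join(jd_skills)
        # 1. bi-encoder (SentenceTransformer) cosine similarity
        emb = bi_encoder.encode([resume_text, jd_text], convert_to_tensor=True)
        bi_score = float(util.cos_sim(emb[0], emb[1]))
        # 2. cross-encoder STS score (stsb models output a similarity in roughly 0..1)
        cross_score = float(cross_encoder.predict([(resume_text, jd_text)])[0])
        # 3. dictionary overlap against a predefined skill list (from the CSV file)
        known = {s.lower() for s in skill_dictionary}
        r = {s.lower() for s in resume_skills} & known
        j = {s.lower() for s in jd_skills} & known
        dict_score = len(r & j) / len(j) if j else 0.0
        # normalized combination into a final matching percentage
        return 100 * (bi_score + cross_score + dict_score) / 3

    print(skill_match(["python", "pandas", "nlp"], ["python", "nlp", "docker"], ["python", "nlp", "docker", "pandas"]))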

    Technical Skills:

    • Programming Languages: Python
    • Libraries & Frameworks: SpaCy, NLTK, NumPy, Pandas, BeautifulSoup, Selenium WebDriver
    • Machine Learning Models: SentenceTransformer, CrossEncoder
    • Tools: VS Code, Google Colab, Lucid Chart, Eraser.io
    • Database Design: Level 2 ER Diagrams
    • Testing: Manual Unit and Integration Testing
    • Additional Skills: Transformers, NLP, Web Scraping, Semantic Textual Similarity (STS), Named Entity Recognition (NER)

    Alert

    The provided code does not represent my complete work during the internship and is incomplete and partial. It is only for preview and educational purposes. It is not final, production-grade code either. Kindly use it with caution ⚠


  • Waymaker


    About the Project

    Waymaker is a personal project about applications aimed at public transport users (frequent or occasional). My own experience as a user of that system was the starting point for the project. The ideation began with the main question, “How can a person who is unfamiliar with a given place get around it?”. A brief study was therefore carried out with some users, employing UX techniques such as interviews, a questionnaire, benchmarking analysis and usability tests, in order to design a visual architecture that transmits information to users more efficiently and actually meets their demands.

    A bit about the app’s visual identity => Link

    Prototype => Link

    If you would like to check the UX process go to this link –> Coming soon 😀

    This is a personal project about mobile applications for public transport. I had my own experience, as a user, to kick-start the project. It all started from the main question, “How can a person who is unfamiliar with a certain location get around it?”. So brief research with users took place, based on the UX design process, to design a more efficient visual architecture that communicates to users the information that actually meets their demands.

    Process

    1. Exploratory research: Through interviews, a questionnaire and an empathy map it was possible to gather information about how users feel about the difficulties they face when using the public transport system. Feelings of insecurity, frustration and anxiety were expressed by most interviewees. Among the main difficulties reported were the difficulty of accessing information about the system and guidance about routes. Most of the time, users turn to other users to obtain such information.
    2. Benchmark analysis: The apps offered by the public transport companies of Rio de Janeiro, Manaus and Vitória were selected for the analysis. In general, the apps did not present a very clear navigation flow and their visual hierarchy was quite poor. Even so, these apps offered some features of relevance to users. The reference app for this project became Moovit, since it offered all the features of the previous apps plus a few more, such as trip monitoring. The only missing feature is real-time vehicle arrival prediction.
    3. Personas and requirements definition: From the creation of Persona profiles (5 personas) covering the main characteristics of public transport users, together with the construction of some use-case scenarios, it was possible to draw up the list of goals users have when using the system.

    User interface

    Based on the list of user goals, a low-fidelity screen flow was produced and an initial prototype was built, which was submitted to some user tests; based on the information collected, new changes were implemented and further rounds of testing followed. This is the result obtained so far.

    Main Contributions

    One of the main contributions was the redesign of the information on the available-routes and directions screens, using the formal structure of a process diagram. In this way, the numbering reinforces the idea of available choices (quantity) and the succession of stages of a process (direction), while the application of colour as a layer of information reinforces the different stages of the trip. All these strategies aim to make the user’s navigation through the information clearer and more intuitive. The possibility of sharing a location and building personalised routes helps reduce the chance of users getting lost when working out new routes, since it is a reliable source of information.

    Implementation

    Visit original content creator repository