We are proud to announce the release of OSINT LIAR 1.5.0. This release is packed with features:
OSINT LIAR is free for personal use and for use by non-profits
Personal use no longer requires an API key
Audit logging was added to track changes when captures are shared between people
Improved integration with existing OSINT tools
Created a Discovery Plugin repository for extending the functionality of OSINT LIAR https://github.com/osint-liar/discovery-plugins
Fixed bugs in the Origin Permissions Access controls
Fixed bugs with single access requests
Improved text search algorithm
Simplified the install process
Security Scans
OSINT LIAR is submitted to VirusTotal on each build. To view the VirusTotal analysis, go to https://www.virustotal.com/gui/url/966c6884eea7512db940bea4ce5ed31f9a4b6861e4a3fea2b64b4b087a81eb28?nocache=1
We always recommend scanning all executables for malware or viruses.
To try it out, download the Chrome Extension from https://chromewebstore.google.com/detail/osint-liar/mgmandbcdaecndcllojppohiekdbhcie
Who Am I, a Chrome extension for username enumeration, has a new release. Who Am I now lets you right-click on highlighted text and uses the “Whats My Name” and “Sherlock” site lists to search across hundreds of social media sites. Who Am I’s search functionality has been improved so you can quickly reduce the number of false positive matches by entering additional text about your person of interest (POI).
Get the Chrome extension free from https://chromewebstore.google.com/detail/who-am-i/gdnhlhadhgnhaenfcphpeakdghkccfoo
Need an enterprise-grade OSINT collection tool for collecting web pages, images, videos and files? Check out OSINT LIAR. It does more than just collect: it saves you time, preserves your content, and integrates with AI/ML service providers. OSINT LIAR transforms your data into information with our data visualization tools.
OSINT LIAR uses a RESTful API for saving and retrieving your data. To protect your data we use tokens. OSINT LIAR “Application Key Tokens” are strings of alphanumeric and other characters. They are not encoded at this time. You MUST provide this token to any application you want to authorize. If the token is missing or incorrect, the OSINT LIAR application blocks the request. Each token is randomly generated to prevent attacks against fixed token values.
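As a rough illustration of the token idea described above, here is a minimal sketch of generating a random token and rejecting requests whose token is missing or wrong. This is not OSINT LIAR's actual implementation: the alphabet, the token length, and the function names generateToken and isAuthorized are all assumptions made for the example.

```javascript
// Use the Web Crypto API where it is a global (browsers, recent Node.js),
// otherwise fall back to Node's built-in crypto module.
const cryptoApi = globalThis.crypto ?? require('node:crypto').webcrypto;

// Hypothetical alphabet: the article only says "alphanumeric and other characters".
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Hypothetical helper: build a random token of the requested length.
function generateToken(length = 32) {
  const bytes = new Uint8Array(length);
  cryptoApi.getRandomValues(bytes);
  // ALPHABET has exactly 64 entries, so keeping the low 6 bits of each
  // random byte selects every character with equal probability.
  return Array.from(bytes, b => ALPHABET[b & 63]).join('');
}

// Hypothetical check: block the request when the token is missing or incorrect.
function isAuthorized(providedToken, expectedToken) {
  if (typeof providedToken !== 'string' || providedToken.length === 0) {
    return false;
  }
  return providedToken === expectedToken;
}
```

Because each token comes from a cryptographically secure random source, an attacker cannot rely on a fixed or predictable value.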
Most Open Source Intelligence collection techniques rely on your web browser for accessing and acquiring data. Chrome Extensions play a major role in helping extract and archive data from web pages. Extensions provide you with the “easy” button when it comes to collecting the data. User Snippets are multiline JavaScript functions that you can run directly against the web pages you are trying to extract data from.
Chrome Extensions: These are pre-made software programs that run within your browser. Extensions can be installed through the Chrome store or loaded directly from your file system. Extensions allow you to customize the look, feel, interactivity and functionality of your web browsing session. The Chrome store has over 180K extensions available for download. If you can think it, there is a Chrome extension for it.
User Snippets: User snippets in Chrome are a feature of the Chrome Developer Tools (DevTools) that allows developers to run JavaScript programs on demand. User Snippets can function like a recipe book for accessing and acquiring data from a variety of websites. They can easily be reused across different websites that use the same underlying technology. The best part about User Snippets is that they can readily be modified and updated at any time.
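To make the idea of a User Snippet concrete, here is a small sketch that lists the unique link targets on a page. The function name collectLinks is hypothetical, and the function takes the document as an argument only so it can also be exercised outside the browser.

```javascript
// Hypothetical snippet: collect the unique link targets found on a page.
function collectLinks(doc) {
  // Every anchor element that carries an href attribute
  const anchors = Array.from(doc.querySelectorAll('a[href]'));
  const urls = anchors.map(anchor => anchor.href).filter(Boolean);
  // Remove duplicates and sort for easier reading
  return [...new Set(urls)].sort();
}

// When run as a DevTools Snippet, print the links of the current page.
if (typeof document !== 'undefined') {
  console.table(collectLinks(document));
}
```

A snippet like this can be saved under Sources > Snippets in DevTools and rerun on any page.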
OSINT LIAR is a highly adaptive next generation collection tool. Get your free trial today at https://osintliar.com
User Snippet: Expand YouTube Comments And Save Using OSINT LIAR
/**
* OSINT LIAR: Snippet for collecting comments from a YouTube video and saving the page and the comments into OSINT LIAR.
* Buy or get a Trial license for OSINT LIAR from https://osintliar.com
* DO NOT REPRODUCE OR COPY WITHOUT PERMISSION
*/
(function expandAllComments() {
let lastScrollHeight = -1;
let saved = false;
let lastNodeCount = 0;
let tries = 0;
const EXTENSION_ID = 'mgmandbcdaecndcllojppohiekdbhcie';
const distance = 150; // How many pixels to scroll on each step
let scrolled = 0;
let flag = false;
const MAX_TRIES = 40; // Set higher if you have a high latency internet connection or saving is happening too early.
let intervalId = setInterval(() => {
// Find all "Show more" and reply buttons for comments and replies
const buttons = document.querySelectorAll(
'.ytd-item-section-renderer, .ytd-continuation-item-renderer, .ytd-replies-alt, .more-button'
);
// Click each "Show more" button if it's not already clicked
buttons.forEach(button => {
if (isButtonVisible(button) && !button.getAttribute('clicked')) {
button.click();
button.setAttribute('clicked', true);
}
});
let exit = false;
if(exit){
flag = true;
console.log('exiting')
}
// Scroll down to load more comments
window.scrollBy(0, distance);
scrolled += distance;
// Check if the scroll position is at the bottom and no more comments to load
if (flag) {
clearInterval(intervalId);
console.log('All comments expanded. Scrolling to the top...');
window.scrollTo(0, 0); // Scroll back to the top of the page
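if(!saved){
saved = true;
// Wait two seconds so the page can settle, then save it into OSINT LIAR.
// delay() returns a Promise, so the save has to run in its then() callback.
delay(2000).then(() => {
chrome.runtime.sendMessage(EXTENSION_ID, {captureTab: true},
function(response) {
console.log(response)
}
);
});
}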
}
else {
// Declare locally so the snippet does not leak an implicit global
const previousHeight = lastScrollHeight;
lastScrollHeight = document.documentElement.scrollHeight;
const element = document.getElementById('sections');
let height = element.clientHeight;
flag = (lastScrollHeight > height + 5000 && tries > MAX_TRIES)
if(flag){
console.log("scroll height flag")
}
if(lastNodeCount === element.children.length){
tries++;
}
if(!flag && tries > MAX_TRIES && previousHeight === lastScrollHeight){
flag = true;
console.log("Max tries exceeded.")
}
else if(lastNodeCount !== element.children.length || previousHeight !== lastScrollHeight)
{
lastNodeCount = element.children.length
tries = 0;
}
}
}, 750); // Adjust time interval as needed, depends on your latency
})();
function delay(milliseconds){
return new Promise(resolve => {
setTimeout(resolve, milliseconds);
});
}
function isButtonVisible(button) {
if (!button) {
console.log('Button not found');
return false;
}
const style = window.getComputedStyle(button);
// Check for display property
if (style.display === 'none') return false;
// Check for visibility property
if (style.visibility === 'hidden') return false;
// Check for opacity
if (style.opacity === '0') return false;
// Check if the button has dimensions
if (button.offsetWidth === 0 && button.offsetHeight === 0) return false;
// Check if the button is in the viewport
const rect = button.getBoundingClientRect();
const inViewport = rect.top >= 0 && rect.left >= 0 && rect.bottom <= (window.innerHeight || document.documentElement.clientHeight) && rect.right <= (window.innerWidth || document.documentElement.clientWidth);
return inViewport;
}
For the last several years, I have learned a lot about Open Source Intelligence (OSINT) and wanted to share my knowledge of this field with others. My background is in Computer Science, and I have 20 years of experience at this point. In a series of blog posts, I will go deeper into each of these capture methods, but for now we will keep it simple. Our example web page is https://osintliar.com. It doesn’t have dynamic content, nor any videos.
In future blog posts, we will go over the problem set involved in live data captures.
Screen Capture
Taking a screenshot of a web page is a quick and straightforward method. It captures the page exactly as it appears at a moment in time, including layout and images. However, it’s static and doesn’t preserve the interactivity or underlying code. Screenshots can be saved in various formats like PNG or JPEG.
Saving As HTML
This method saves the HTML file of the web page. HTML (Hypertext Markup Language) is the standard markup language used to create web pages. When you save a page as HTML, it typically involves saving the basic HTML file along with a folder containing the associated files like images, stylesheets (CSS), and JavaScript files.
Saving As MHTML
MHTML (MIME Encapsulation of Aggregate HTML Documents) is a web page archive format used to combine resources like images, JavaScript, CSS, and HTML into a single file. When you save a page as MHTML, it creates a single file that encapsulates the entire page.
PDF Export
Exporting a web page as a PDF is a useful way to preserve its visual layout and text content. This method is widely supported and convenient for sharing and viewing. However, like screenshots, it creates a static record and doesn’t preserve the interactivity or the full functionality of the web page.
What Works Best?
My preference is for storing web pages as MHTML. MHTML files preserve the underlying HTML source code, and JavaScript is not enabled in them. OSINT LIAR stores your web page captures as MHTML on your local computer, in an encrypted database, not in the cloud.
Did you know, OSINT LIAR can store all of these web page captures? If you have a favorite tool for doing your captures, awesome! OSINT LIAR is already compatible with it.
OCR (optical character recognition) is the process of converting printed documents or images containing words into digital text that can be used for analysis. This is helpful for transforming PDFs that contain images with words into something that can be indexed and made searchable within LIA. OCR is also known as text recognition.
The extracted text can also serve as a first step in translating foreign-language documents into one’s native language.
OCR functionality is not natively built into LIA, but in this article, I will demonstrate how LIA can readily be configured to interact with an online OCR service.
Step 1. Select an image
Let’s grab a meme off the internet using LIA.
Step 2. Push the image to the OCR API
By clicking on the drop down menu in the dashboard of LIA you can select the option to push content to the OCR provider. After clicking the “OCR” option the API immediately responds with the following data.
{"ParsedResults":[{"TextOverlay":{"Lines":[{"LineText":"$ I DON'T THINK THAT MEMES","Words":[{"WordText":"$","Left":0.0,"Top":24.0,"Height":23.0,"Width":13.0},{"WordText":"I","Left":24.0,"Top":28.0,"Height":49.0,"Width":13.0},{"WordText":"DON'T","Left":54.0,"Top":27.0,"Height":51.0,"Width":141.0},{"WordText":"THINK","Left":206.0,"Top":28.0,"Height":50.0,"Width":150.0},{"WordText":"THAT","Left":367.0,"Top":28.0,"Height":50.0,"Width":125.0},{"WordText":"MEMES","Left":505.0,"Top":27.0,"Height":51.0,"Width":172.0}],"MaxHeight":51.0,"MinTop":24.0},{"LineText":"WHAT YOU THINK IT MEMES","Words":[{"WordText":"WHAT","Left":19.0,"Top":395.0,"Height":50.0,"Width":147.0},{"WordText":"YOU","Left":177.0,"Top":394.0,"Height":52.0,"Width":96.0},{"WordText":"THINK","Left":287.0,"Top":395.0,"Height":50.0,"Width":150.0},{"WordText":"IT","Left":450.0,"Top":394.0,"Height":51.0,"Width":45.0},{"WordText":"MEMES","Left":508.0,"Top":394.0,"Height":51.0,"Width":172.0}],"MaxHeight":52.0,"MinTop":394.0}],"HasOverlay":true,"Message":"Total lines: 2"},"TextOrientation":"0","FileParseExitCode":1,"ParsedText":"$ I DON'T THINK THAT MEMES\r\nWHAT YOU THINK IT MEMES\r\n","ErrorMessage":"","ErrorDetails":""}],"OCRExitCode":1,"IsErroredOnProcessing":false,"ProcessingTimeInMilliseconds":"328","SearchablePDFURL":"Searchable PDF not generated as it was not requested."}
If the “AutoCapture” feature is enabled, this content will be saved into LIA for you.
How to configure
In just a couple more steps you can add this functionality to your LIA installation. First, sign up for a free account at http://ocr.space/OCRAPI. After getting your API key, add it to the API Keys page.
After the API key is saved, you can add the functionality by editing your JSON config file located at ~/AppData/Local/Lia/webroot/json-plugins.json. Don’t worry, this part will be automated sooner rather than later.
Make sure the JSON is valid before saving. If you need help, contact support@bakerstreet.llc. This JSON code maps LIA’s internal API code to the external API made available through https://api.ocr.space/Parse/Image.
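For readers who want to see what the round trip to the OCR provider looks like outside of LIA, here is a minimal sketch. It does not show LIA's plugin configuration. The endpoint is the one named above, while the request field names (apikey, url, language, isOverlayRequired) reflect my understanding of the provider's public API and should be verified against its documentation. The function names buildOcrRequest and extractLines are hypothetical.

```javascript
// Endpoint as named in this article.
const OCR_ENDPOINT = 'https://api.ocr.space/Parse/Image';

// Hypothetical helper: describe the POST request without sending it.
// Assumption: the provider accepts these form fields and an "apikey" header.
function buildOcrRequest(apiKey, imageUrl) {
  const body = new URLSearchParams({
    url: imageUrl,
    language: 'eng',
    isOverlayRequired: 'true',
  });
  return {
    endpoint: OCR_ENDPOINT,
    options: { method: 'POST', headers: { apikey: apiKey }, body },
  };
}

// Hypothetical helper: reduce a response shaped like the one shown earlier
// to its non-empty lines of recognized text.
function extractLines(ocrResponse) {
  if (ocrResponse.IsErroredOnProcessing) {
    throw new Error('OCR provider reported an error');
  }
  return (ocrResponse.ParsedResults || [])
    .flatMap(result => result.ParsedText.split(/\r?\n/))
    .filter(line => line.length > 0);
}

// Usage (requires network access and a real key):
//   const { endpoint, options } = buildOcrRequest(myKey, imageUrl);
//   const response = await fetch(endpoint, options);
//   console.log(extractLines(await response.json()));
```

Keeping the request description separate from the network call makes it easy to inspect exactly what would be sent before any data leaves your machine.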
The Local Internet Archive or LIA now supports Boolean search. Boolean search can help you narrow or widen your search criteria depending on your needs. LIA supports the following Boolean search options:
NOT
AND
OR
These provide the ability to focus a search, particularly when your topic contains multiple search terms.
For example, let’s imagine we want to filter search results by state name. One query may look like this:
“MICHIGAN NOT FLORIDA”
This returns all the results that contain the keyword “MICHIGAN” but do not contain the word “FLORIDA”. This is helpful for narrowing the results of a search space.
If we needed to widen our scope and include all results that have either “MICHIGAN” or “FLORIDA” we would use the following query.
“MICHIGAN OR FLORIDA”
In our last example, we only want to get results that have “MICHIGAN” and “FLORIDA”. We would use the next query.
“MICHIGAN AND FLORIDA”
Using this search phrase will restrict the results to search hits that contain both “MICHIGAN” and “FLORIDA”.
Complex Boolean Searches
It is possible to chain Boolean searches together to expand or narrow the search space even more.
Here is a good example that uses 2 Boolean operators.
“MICHIGAN AND FLORIDA NOT OHIO”
These can be chained together repeatedly to narrow or widen the scope of your search.
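The behaviour of the three operators can be sketched in a few lines of JavaScript. This is only an illustration, not LIA's search engine: it assumes single-word terms, case-insensitive matching, and strict left-to-right evaluation of chained operators, none of which is confirmed here. The function name matchesQuery is hypothetical.

```javascript
// Hypothetical helper: test one document against a simple Boolean query.
// Assumption: operators are applied left to right, with no precedence rules.
function matchesQuery(query, text) {
  const haystack = text.toUpperCase();
  const tokens = query.trim().split(/\s+/);
  const contains = term => haystack.includes(term.toUpperCase());

  // Start with the first term, then fold in each "OPERATOR TERM" pair.
  let result = contains(tokens[0]);
  for (let i = 1; i + 1 < tokens.length; i += 2) {
    const operator = tokens[i].toUpperCase();
    const hit = contains(tokens[i + 1]);
    if (operator === 'AND') result = result && hit;
    else if (operator === 'OR') result = result || hit;
    else if (operator === 'NOT') result = result && !hit;
    else throw new Error(`Unknown operator: ${tokens[i]}`);
  }
  return result;
}
```

For example, matchesQuery('MICHIGAN AND FLORIDA NOT OHIO', text) accepts a document only if it mentions both Michigan and Florida and does not mention Ohio.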