Improvement/stricter serp api enrichment filter #2670

epipav · 2024-11-04T10:11:14Z

Changes proposed ✍️

What

copilot:summary

copilot:poem

Why

How

copilot:walkthrough

Checklist ✅

Label appropriately with Feature, Improvement, or Bug.
Add screenshots to the PR description for relevant FE changes
New backend functionality has been unit-tested.
API documentation has been updated (if necessary) (see docs on API documentation).
Quality standards are met.

Summary by CodeRabbit

New Features
- Introduced credit management checks for member enrichment, ensuring data is fetched only if sufficient credits are available.
- Added new properties and methods to enhance criteria for member enrichment based on activity levels.
- Implemented Redis caching for optimized credit checks and data retrieval.
- New method to check remaining credits added to multiple enrichment services.
- Added a function to identify the member with the most activity on a platform.
Bug Fixes
- Updated activity count threshold for member enrichment to improve accuracy.
Documentation
- Added new interface for account usage data to enhance clarity on available resources.
- Updated interfaces to reflect changes in pagination and member enrichment logic.

…ty count filter to clearbit

coderabbitai · 2024-11-04T10:11:22Z

Walkthrough

The pull request introduces several new asynchronous functions and modifications related to credit management for member enrichment sources. Key changes include the addition of functions to check and manage credit availability using a Redis cache. The EnrichmentServiceClearbit, EnrichmentServiceProgAI, and EnrichmentServiceSerpApi classes have been updated to include a hasRemainingCredits method, enhancing the criteria for member enrichment based on activity levels and ensuring that enrichment data is only fetched if sufficient credits are available.

Changes

File	Change Summary
`services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts`	Added functions: `setHasRemainingCredits`, `getHasRemainingCredits`, `hasRemainingCreditsExists`, `hasRemainingCredits`. Modified `getEnrichmentData` to check for remaining credits.
`services/apps/premium/members_enrichment_worker/src/sources/clearbit/service.ts`	Updated `enrichableBySql` property and added `enrichMembersWithActivityMoreThan`. Added method `hasRemainingCredits`.
`services/apps/premium/members_enrichment_worker/src/sources/progai/service.ts`	Added method `hasRemainingCredits`. No other changes.
`services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts`	Updated `enrichMembersWithActivityMoreThan` to 500 and added method `hasRemainingCredits` to check credits via GET request.
`services/apps/premium/members_enrichment_worker/src/sources/serp/types.ts`	Added new interface `ISerpApiAccountUsageData` for account usage properties.
`services/apps/premium/members_enrichment_worker/src/types.ts`	Added method `hasRemainingCredits` to `IEnrichmentService` interface.
`services/libs/redis/src/cache.ts`	Added method `exists` to `RedisCache` class to check key existence in Redis cache.
`services/apps/premium/members_enrichment_worker/src/activities.ts`	Added import and export for `findMemberIdentityWithTheMostActivityInPlatform`.
`services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts`	Updated `getEnrichableMembers` function to replace `afterId` with `afterCursor`.
`services/apps/premium/members_enrichment_worker/src/workflows/enrichMember.ts`	Modified logic to utilize `findMemberIdentityWithTheMostActivityInPlatform` for selecting GitHub identities.
`services/apps/premium/members_enrichment_worker/src/workflows/getMembersToEnrich.ts`	Updated `getMembersToEnrich` to use `afterCursor` instead of `afterId`.
`services/libs/data-access-layer/src/activities/sql.ts`	Added `findMemberIdentityWithTheMostActivityInPlatform` function for querying activities.
`services/libs/data-access-layer/src/old/apps/premium/members_enrichment_worker/index.ts`	Updated `fetchMembersForEnrichment` to use `afterCursor` instead of `afterId`.
`services/libs/types/src/premium/enrichment.ts`	Added new interface `IEnrichableMemberIdentityActivityAggregate`.

Possibly related PRs

Member enrichment null byte sanitization, retry policy updates #2661: This PR introduces modifications to the IEnrichmentService interface, adding a property related to caching, which is relevant to the caching functionality introduced in the main PR.
Serp linkedin finder enrichment source #2664: This PR adds a new enrichment source related to LinkedIn, which may interact with the member enrichment logic and credit management changes in the main PR.

Suggested labels

Feature

Suggested reviewers

sausage-todd

🐰 In the meadow where the bunnies play,
New credits check brighten the day.
With Redis caching, swift and neat,
Enrichment flows, a joyful feat!
Hopping high with data so bright,
Our members enriched, what a delight! 🌼

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (10)

services/apps/premium/members_enrichment_worker/src/sources/serp/types.ts (2)
10-26: Mark sensitive fields with a comment.

The interface structure is well-defined and comprehensive. However, consider marking sensitive fields to ensure proper handling throughout the codebase.
 export interface ISerpApiAccountUsageData {
   account_id: string
+  /** @sensitive */
   api_key: string
+  /** @sensitive */
   account_email: string
   account_status: string
   plan_id: string
10-26: Add JSDoc comments to document the interface purpose and key fields.

Consider adding documentation to improve code maintainability and help other developers understand the purpose of this interface and its critical fields.
+/**
+ * Represents SERP API account usage data and limits.
+ * Used for managing API credits and rate limiting.
+ */
 export interface ISerpApiAccountUsageData {
   account_id: string
   api_key: string
   account_email: string
   account_status: string
   plan_id: string
   plan_name: string
   plan_monthly_price: number
+  /** Maximum number of searches allowed per month under current plan */
   searches_per_month: number
+  /** Remaining searches available in the current plan */
   plan_searches_left: number
+  /** Additional credits beyond the plan limit */
   extra_credits: number
+  /** Total remaining searches (plan + extra credits) */
   total_searches_left: number
+  /** Number of searches used in the current month */
   this_month_usage: number
+  /** Number of searches performed in the current hour */
   this_hour_searches: number
+  /** Number of searches performed in the previous hour */
   last_hour_searches: number
+  /** Maximum number of searches allowed per hour */
   account_rate_limit_per_hour: number
 }
services/apps/premium/members_enrichment_worker/src/types.ts (2)
40-42: LGTM! Consider enhancing the documentation.

The method signature and purpose are well-defined. However, consider adding documentation about:

Error handling behavior

Whether the implementation should retry on failure

What happens when the credit check service is unavailable

41-41: Consider extracting the cache duration to a constant.

The 60-second cache duration is hardcoded in the comment. Consider extracting this value to a named constant to make it configurable and maintain it in a single place.
+// Duration in seconds for caching credit check results
+export const CREDIT_CHECK_CACHE_DURATION = 60;
services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts (1)

Line range hint 21-64: Consider implementing a more flexible enrichment strategy.

The current changes make the enrichment more restrictive through two mechanisms:

A high fixed activity threshold (500)

Credit availability checks

Consider these architectural improvements:

Make the activity threshold configurable through environment variables

Implement a priority queue for enrichment requests based on:

Member activity level

Last enrichment attempt

Available credits

Add metrics/monitoring for:

Credit usage patterns

Enrichment success rates

API response times

This would provide more flexibility and visibility into the enrichment process.
services/libs/redis/src/cache.ts (1)
46-50: LGTM! The implementation is correct and consistent.

The new exists method follows the established patterns in the class:

Uses the prefixer for key consistency

Correctly interprets Redis EXISTS command response (1 = exists, 0 = doesn't exist)

Maintains the async/await pattern used throughout the class

Consider adding error handling consistent with other methods:
 async exists(key: string): Promise<boolean> {
+  try {
     const actualKey = this.prefixer(key)
     const value = await this.client.exists(actualKey)
     return value === 1
+  } catch (error) {
+    this.log.error('Failed to check key existence', { error, key })
+    throw error
+  }
 }
Consider adding JSDoc documentation for better API visibility:
+/**
+ * Checks if a key exists in the cache
+ * @param key - The key to check
+ * @returns Promise resolving to true if the key exists, false otherwise
+ */
 async exists(key: string): Promise<boolean> {
services/apps/premium/members_enrichment_worker/src/sources/clearbit/service.ts (1)
30-31: Consider making the activity threshold configurable.

The hardcoded activity threshold of 10 should be moved to a configuration file or environment variable for better maintainability and flexibility across different environments.
-  public enrichMembersWithActivityMoreThan = 10
+  public enrichMembersWithActivityMoreThan = Number(process.env.CROWD_ENRICHMENT_MIN_ACTIVITY_THRESHOLD ?? 10)
services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts (3)
56-66: Simplify the setHasRemainingCredits Function

The current implementation uses a conditional statement to set the cache value to 'true' or 'false'. This can be simplified by directly converting the boolean hasCredits to a string.

Apply this refactor to streamline the function:
export async function setHasRemainingCredits(
  source: MemberEnrichmentSource,
  hasCredits: boolean,
): Promise<void> {
  const redisCache = new RedisCache(`enrichment-${source}`, svc.redis, svc.log)
- if (hasCredits) {
-   await redisCache.set('hasRemainingCredits', 'true', 60)
- } else {
-   await redisCache.set('hasRemainingCredits', 'false', 60)
- }
+ await redisCache.set('hasRemainingCredits', hasCredits.toString(), 60)
}
73-76: Eliminate Redundant Existence Check

The hasRemainingCreditsExists function separately checks for the existence of the key, but getHasRemainingCredits can inherently handle missing keys.

Consider removing hasRemainingCreditsExists and modifying hasRemainingCredits to rely solely on the result of getHasRemainingCredits:
- if (await hasRemainingCreditsExists(source)) {
+ const cachedValue = await getHasRemainingCredits(source)
+ if (cachedValue !== null) {
    return cachedValue
}
85-87: Consider Making Cache Expiration Time Configurable

The expiration time for the cache is hardcoded to 60 seconds. Depending on how frequently the credit availability changes, a different duration might be more appropriate.

Introduce a configurable constant or parameter for the cache TTL:
+ const CACHE_TTL_SECONDS = 60; // Adjust as needed

export async function setHasRemainingCredits(
  source: MemberEnrichmentSource,
  hasCredits: boolean,
): Promise<void> {
  // ...
- await redisCache.set('hasRemainingCredits', hasCredits.toString(), 60)
+ await redisCache.set('hasRemainingCredits', hasCredits.toString(), CACHE_TTL_SECONDS)
}
This enhances maintainability and allows easy adjustments in the future.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6875221 and 9d44c1c.

📒 Files selected for processing (7)

services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts (3 hunks)
services/apps/premium/members_enrichment_worker/src/sources/clearbit/service.ts (2 hunks)
services/apps/premium/members_enrichment_worker/src/sources/progai/service.ts (1 hunks)
services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts (2 hunks)
services/apps/premium/members_enrichment_worker/src/sources/serp/types.ts (1 hunks)
services/apps/premium/members_enrichment_worker/src/types.ts (1 hunks)
services/libs/redis/src/cache.ts (1 hunks)

🔇 Additional comments (8)

services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts (2)

12-16: LGTM: Import changes are well-organized.

The new type import is properly grouped with related SERP API types.

21-21: Verify the impact of increased activity threshold.

The threshold increase from 10 to 500 activities is substantial and could significantly affect member enrichment patterns. Let's verify the impact:

services/libs/redis/src/cache.ts (1)

46-50: Verify the integration with credit management system.

Let's confirm how this new method is being used in the credit management system.

services/apps/premium/members_enrichment_worker/src/sources/clearbit/service.ts (1)

32-32: LGTM! Enhanced member filtering criteria.

The SQL condition effectively combines activity threshold with email verification, implementing stricter filtering as intended in the PR objectives.

services/apps/premium/members_enrichment_worker/src/sources/progai/service.ts (1)

127-129: ⚠️ Potential issue

Implementation needs actual credit checking logic.

The current implementation of hasRemainingCredits always returns true, which effectively bypasses any credit management. This doesn't align with the PR's objective of implementing stricter filtering.

Consider implementing proper credit checking logic:

Integrate with the Redis cache for credit tracking
Add error handling for API/cache failures
Include unit tests for various credit scenarios

Let's verify if other enrichment services have proper implementations:

services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts (3)

7-7: Importing RedisCache for Caching Functionality

The addition of RedisCache import is appropriate for implementing caching mechanisms in the subsequent functions.

31-31: Ensuring Credit Availability Before Data Retrieval

The updated condition in getEnrichmentData now checks (await hasRemainingCredits(source)) before fetching data. This ensures that enrichment data is only retrieved when there are sufficient credits, optimizing resource usage.

85-87: Verify the Appropriateness of Cache TTL

The 60-second cache expiration might not align with the actual frequency of credit updates from the enrichment services.

Consider confirming the update interval of credit information from the enrichment services. If the credits update less frequently, extending the cache TTL can reduce unnecessary service calls.

Would you like assistance in determining an optimal cache duration based on the service's update frequency?

coderabbitai · 2024-11-04T10:20:29Z

services/apps/premium/members_enrichment_worker/src/types.ts

+  // does the source have credits to enrich members, if returned false the source will be skipped
+  // response will be saved to redis for 60 seconds and will be used for subsequent calls
+  hasRemainingCredits(): Promise<boolean>


💡 Codebase verification

Implementation check reveals potential issues with credit validation

The verification shows that:

SERP API service has a proper implementation checking the API account status

Both Clearbit and ProgAI services have placeholder implementations that always return true

This could lead to unnecessary API calls even when credits are exhausted

• services/apps/premium/members_enrichment_worker/src/sources/clearbit/service.ts: Implement proper credit checking
• services/apps/premium/members_enrichment_worker/src/sources/progai/service.ts: Implement proper credit checking

🔗 Analysis chain

Verify implementation across all enrichment services.

Let's ensure all enrichment services properly implement the new hasRemainingCredits method.

Let me try a different approach to search for the implementations.

Let me check specifically for the hasRemainingCredits implementation in these identified service files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Description: Check implementation of hasRemainingCredits across enrichment services # Test: Search for class implementations of enrichment services echo "Searching for enrichment service implementations..." ast-grep --pattern 'class $_ implements IEnrichmentService { $$$ }' # Test: Verify hasRemainingCredits implementation in each service echo "Verifying hasRemainingCredits implementation..." ast-grep --pattern 'async hasRemainingCredits(): Promise<boolean> { $$$ }'

Length of output: 368

Script:

#!/bin/bash # Search for files containing IEnrichmentService echo "Files referencing IEnrichmentService:" rg "IEnrichmentService" -l # Search for class implementations and hasRemainingCredits echo -e "\nImplementations and method details:" rg "class.*implements.*IEnrichmentService" -A 20

Length of output: 8082

Script:

#!/bin/bash # Check for hasRemainingCredits implementation in each service file echo "Checking SERP implementation:" rg "hasRemainingCredits" services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts -A 5 echo -e "\nChecking Clearbit implementation:" rg "hasRemainingCredits" services/apps/premium/members_enrichment_worker/src/sources/clearbit/service.ts -A 5 echo -e "\nChecking ProgAI implementation:" rg "hasRemainingCredits" services/apps/premium/members_enrichment_worker/src/sources/progai/service.ts -A 5

Length of output: 1195

coderabbitai · 2024-11-04T10:20:29Z

services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts

+  async hasRemainingCredits(): Promise<boolean> {
+    const config = {
+      method: 'get',
+      url: `https://serpapi.com/account`,
+      params: {
+        api_key: process.env['CROWD_ENRICHMENT_SERP_API_KEY'],
+      },
+    }
+
+    const response: ISerpApiAccountUsageData = (await axios(config)).data
+
+    return response.total_searches_left > 0
+  }


⚠️ Potential issue

Add error handling and caching for credit checks.

The current implementation has several potential issues:

No error handling for API failures

No caching of credit checks (could hit rate limits)

Direct environment variable access without validation

No rate limiting for credit checks

Consider implementing this improved version:

+ private static CACHE_KEY = 'serp-api-credits'; + private static CACHE_TTL = 300; // 5 minutes async hasRemainingCredits(): Promise<boolean> { + try { + // Check if API key is configured + const apiKey = process.env['CROWD_ENRICHMENT_SERP_API_KEY']; + if (!apiKey) { + this.log.error('SERP API key not configured'); + return false; + } + + // TODO: Add Redis caching here + // const cached = await redis.get(EnrichmentServiceSerpApi.CACHE_KEY); + // if (cached !== null) { + // return cached === 'true'; + // } + const config = { method: 'get', url: `https://serpapi.com/account`, params: { - api_key: process.env['CROWD_ENRICHMENT_SERP_API_KEY'], + api_key: apiKey, }, } const response: ISerpApiAccountUsageData = (await axios(config)).data + const hasCredits = response.total_searches_left > 0; + + // TODO: Cache the result + // await redis.set( + // EnrichmentServiceSerpApi.CACHE_KEY, + // hasCredits.toString(), + // 'EX', + // EnrichmentServiceSerpApi.CACHE_TTL + // ); + - return response.total_searches_left > 0 + return hasCredits; + } catch (error) { + this.log.error('Error checking SERP API credits:', error); + return false; + } }

Would you like me to help implement the Redis caching layer? I can provide the complete implementation or create a GitHub issue to track this enhancement.

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-04T10:20:29Z

services/apps/premium/members_enrichment_worker/src/sources/clearbit/service.ts

+  async hasRemainingCredits(): Promise<boolean> {
+    return true
+  }


💡 Codebase verification

⚠️ Potential issue

Implement actual credit checking logic similar to SERP service

The current implementation always returns true, while other services like SERP implement proper credit checking. For Clearbit service, consider implementing similar logic:

- async hasRemainingCredits(): Promise<boolean> { - return true - } + async hasRemainingCredits(): Promise<boolean> { + try { + const response = await axios.get('https://api.clearbit.com/v1/remaining_credits', { + auth: { + username: process.env['CROWD_ENRICHMENT_CLEARBIT_API_KEY'], + password: '' + } + }); + return response.data.remaining > 0; + } catch (error) { + this.log.error('Failed to check Clearbit credits:', error); + return false; + } + }

The SERP service properly checks remaining credits via their API

The ProgAI service also needs proper implementation

Credit checking is crucial as mentioned in the interface comments: "if returned false the source will be skipped"

🔗 Analysis chain

Implement actual credit checking logic.

The current implementation always returns true, which defeats the purpose of credit management. This should be implemented to actually check Clearbit API credits.

Let's verify if other enrichment services have proper credit checking implemented:

Consider implementing proper credit checking:

- async hasRemainingCredits(): Promise<boolean> { - return true - } + async hasRemainingCredits(): Promise<boolean> { + try { + // Add actual credit checking logic here + // Example: Check Clearbit API quota/usage endpoints + return true + } catch (err) { + this.log.error(`Failed to check Clearbit credits: ${err}`) + return false + } + }

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Search for hasRemainingCredits implementations in other enrichment services ast-grep --pattern 'async hasRemainingCredits(): Promise<boolean> { $$$ }'

Length of output: 78

Script:

#!/bin/bash # Let's try a broader search for hasRemainingCredits implementations rg -l "hasRemainingCredits" # Then let's look at the interface definition rg "interface.*EnrichmentService" -A 10

Length of output: 1616

Script:

#!/bin/bash # Let's examine the implementations in other services rg "hasRemainingCredits.*Promise<boolean>" -A 5 services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts services/apps/premium/members_enrichment_worker/src/sources/progai/service.ts

Length of output: 1554

Script:

#!/bin/bash # Let's see the complete SERP implementation to understand the credit checking logic rg "hasRemainingCredits.*Promise<boolean>" -A 15 services/apps/premium/members_enrichment_worker/src/sources/serp/service.ts

Length of output: 622

coderabbitai · 2024-11-04T10:20:29Z

services/apps/premium/members_enrichment_worker/src/sources/progai/service.ts

+  async hasRemainingCredits(): Promise<boolean> {
+    return true
+  }


🛠️ Refactor suggestion

Align implementation with existing patterns.

To maintain consistency with the rest of the codebase, consider these improvements:

Add error handling similar to getData methods

Use the logger for monitoring credit checks

Consider adding credit threshold configuration via environment variables

Here's a suggested implementation:

async hasRemainingCredits(): Promise<boolean> { - return true + try { + this.log.debug('Checking ProgAI remaining credits') + // Add credit checking logic here + const hasCredits = true // Replace with actual check + this.log.debug(`ProgAI credits available: ${hasCredits}`) + return hasCredits + } catch (err) { + this.log.error('Error checking ProgAI credits', err) + throw new Error(`Failed to check ProgAI credits: ${err.message}`) + } }

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-04T10:20:29Z

services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts

+export async function getHasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> {
+  const redisCache = new RedisCache(`enrichment-${source}`, svc.redis, svc.log)
+  return (await redisCache.get('hasRemainingCredits')) === 'true'
+}


⚠️ Potential issue

Handle Missing Cache Entries in getHasRemainingCredits

If the cache key does not exist, redisCache.get may return null, leading to unexpected behavior when comparing with 'true'.

Modify the function to handle null values explicitly:

export async function getHasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> { const redisCache = new RedisCache(`enrichment-${source}`, svc.redis, svc.log) - return (await redisCache.get('hasRemainingCredits')) === 'true' + const value = await redisCache.get('hasRemainingCredits') + if (value === 'true') { + return true + } else if (value === 'false') { + return false + } else { + return false // or consider throwing an error if appropriate + } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export async function getHasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> {

const redisCache = new RedisCache(`enrichment-${source}`, svc.redis, svc.log)

return (await redisCache.get('hasRemainingCredits')) === 'true'

}

export async function getHasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> {

const redisCache = new RedisCache(`enrichment-${source}`, svc.redis, svc.log)

const value = await redisCache.get('hasRemainingCredits')

if (value === 'true') {

return true

} else if (value === 'false') {

return false

} else {

return false // or consider throwing an error if appropriate

}

}

coderabbitai · 2024-11-04T10:20:30Z

services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts

+export async function hasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> {
+  const service = EnrichmentSourceServiceFactory.getEnrichmentSourceService(source, svc.log)
+
+  if (await hasRemainingCreditsExists(source)) {
+    return getHasRemainingCredits(source)
+  }
+
+  const hasCredits = await service.hasRemainingCredits()
+
+  await setHasRemainingCredits(source, hasCredits)
+  return hasCredits
+}


🛠️ Refactor suggestion

Optimize hasRemainingCredits Function Logic

The current implementation of hasRemainingCredits involves multiple function calls and checks, which can be streamlined for better performance and readability.

Refactor the function as follows:

export async function hasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> { const cachedValue = await getHasRemainingCredits(source) - if (await hasRemainingCreditsExists(source)) { - return getHasRemainingCredits(source) - } + if (cachedValue !== null) { + return cachedValue + } const service = EnrichmentSourceServiceFactory.getEnrichmentSourceService(source, svc.log) const hasCredits = await service.hasRemainingCredits() await setHasRemainingCredits(source, hasCredits) return hasCredits }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export async function hasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> {

const service = EnrichmentSourceServiceFactory.getEnrichmentSourceService(source, svc.log)

if (await hasRemainingCreditsExists(source)) {

return getHasRemainingCredits(source)

}

const hasCredits = await service.hasRemainingCredits()

await setHasRemainingCredits(source, hasCredits)

return hasCredits

}

export async function hasRemainingCredits(source: MemberEnrichmentSource): Promise<boolean> {

const cachedValue = await getHasRemainingCredits(source)

if (cachedValue !== null) {

return cachedValue

}

const service = EnrichmentSourceServiceFactory.getEnrichmentSourceService(source, svc.log)

const hasCredits = await service.hasRemainingCredits()

await setHasRemainingCredits(source, hasCredits)

return hasCredits

}

…hub.com:CrowdDotDev/crowd.dev into improvement/stricter-serp-api-enrichment-filter

…embers with most activities

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (8)

services/libs/types/src/premium/enrichment.ts (1)
28-32: LGTM! Consider adding JSDoc comments.

The new interface is well-structured and serves its purpose in aggregating activity data for member identities. The property types are appropriate for tracking activity metrics across platforms.

Consider adding JSDoc comments to document the interface's purpose and properties:
+/**
+ * Aggregates activity metrics for a member's identity on a specific platform.
+ */
 export interface IEnrichableMemberIdentityActivityAggregate {
+  /** Number of activities performed by this identity */
   activityCount: number
+  /** Username of the identity on the platform */
   username: string
+  /** Platform where the activities were performed */
   platform: string
 }
services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts (1)
14-14: LGTM! Consider adding JSDoc documentation.

The switch to cursor-based pagination with activity count is a good improvement. Consider adding JSDoc documentation to explain the cursor structure and its role in member prioritization.
+/**
+ * Fetches members eligible for enrichment, ordered by activity count.
+ * @param limit Maximum number of members to fetch
+ * @param sources Array of enrichment sources to consider
+ * @param afterCursor Pagination cursor containing activity count and member ID, or null for first page
+ * @returns Promise resolving to array of enrichable members
+ */
 export async function getEnrichableMembers(
services/apps/premium/members_enrichment_worker/src/workflows/getMembersToEnrich.ts (2)
57-60: Add safety check for empty members array

While the current implementation works correctly with the defined batch size, consider adding a safety check before accessing the last array element to make the code more robust against potential configuration changes.
+ if (members.length === 0) {
+   return;
+ }
  await continueAsNew<typeof getMembersToEnrich>({
    afterCursor: {
      memberId: members[members.length - 1].id,
      activityCount: members[members.length - 1].activityCount,
    },
  })
Line range hint 35-52: Well-structured workflow implementation

The workflow follows Temporal best practices with:

Proper child workflow isolation using unique workflowIds

Well-configured retry policies with exponential backoff

Appropriate timeout settings

Proper tenant isolation using searchAttributes
services/apps/premium/members_enrichment_worker/src/workflows/enrichMember.ts (1)
59-81: Improve code structure while maintaining the enhanced logic.

While the logic for selecting the most active GitHub identity is sound, the implementation could be more efficient and maintainable.

Consider this refactoring:
-      // there can be multiple verified identities in github, we select the one with the most activities
-      const verifiedGithubIdentities = input.identities.filter(
-        (i) =>
-          i.verified &&
-          i.platform === PlatformType.GITHUB &&
-          i.type === MemberIdentityType.USERNAME,
-      )
-
-      if (verifiedGithubIdentities.length > 1) {
-        const ghIdentityWithTheMostActivities =
-          await findMemberIdentityWithTheMostActivityInPlatform(input.id, PlatformType.GITHUB)
-        if (ghIdentityWithTheMostActivities) {
-          enrichmentInput.github = input.identities.find(
-            (i) =>
-              i.verified &&
-              i.platform === PlatformType.GITHUB &&
-              i.type === MemberIdentityType.USERNAME &&
-              i.value === ghIdentityWithTheMostActivities.username,
-          )
-        }
-      } else {
-        enrichmentInput.github = verifiedGithubIdentities?.[0] || undefined
-      }
+      // Select the GitHub identity with the most activities
+      const verifiedGithubIdentities = input.identities.filter(
+        (i) => i.verified && 
+              i.platform === PlatformType.GITHUB && 
+              i.type === MemberIdentityType.USERNAME
+      );
+      
+      try {
+        enrichmentInput.github = verifiedGithubIdentities.length > 1
+          ? await findMemberIdentityWithTheMostActivityInPlatform(input.id, PlatformType.GITHUB)
+              .then(mostActive => mostActive 
+                ? verifiedGithubIdentities.find(i => i.value === mostActive.username)
+                : verifiedGithubIdentities[0]
+              )
+          : verifiedGithubIdentities[0] || undefined;
+
+        console.debug(
+          `Selected GitHub identity: ${enrichmentInput.github?.value} from ${verifiedGithubIdentities.length} candidates`
+        );
+      } catch (error) {
+        console.error('Error finding most active GitHub identity:', error);
+        enrichmentInput.github = verifiedGithubIdentities[0] || undefined;
+      }
This refactoring:

Simplifies the nested conditionals

Adds error handling

Includes logging for debugging

Uses a more functional approach
services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts (1)
126-131: Consider adding JSDoc for better documentation

While the function name is descriptive, adding JSDoc would help document the purpose and parameters more clearly.
+/**
+ * Finds the member identity with the highest activity count for a given platform
+ * @param memberId - The ID of the member to search for
+ * @param platform - The platform to filter activities by
+ * @returns Promise resolving to the member identity with activity aggregate
+ */
 export async function findMemberIdentityWithTheMostActivityInPlatform(
   memberId: string,
   platform: string,
 ): Promise<IEnrichableMemberIdentityActivityAggregate> {
services/libs/data-access-layer/src/old/apps/premium/members_enrichment_worker/index.ts (2)
66-76: Consider adding an index to optimize the JSON aggregation

While the JSON aggregation of member identities is well-structured, this operation could be expensive for members with many identities.

Consider adding a composite index on (memberId, platform, value, type, verified) to optimize the aggregation performance.

Cursor-based pagination change requires attention

The verification revealed two different pagination implementations:

getEnrichableMembers in activities/getMembers.ts correctly uses the new cursor-based pagination with activityCount and memberId

getMembersForLFIDEnrichment workflow still uses the old ID-based pagination with afterId

Issues found:

getMembersForLFIDEnrichment workflow needs to be updated to use the new cursor structure

getLFIDEnrichableMembers activity (referenced but implementation not shown) likely needs to be updated to match the new pagination pattern

🔗 Analysis chain

Line range hint 19-91: Verify API consumers are updated for the new cursor structure

The change from ID-based to cursor-based pagination is a breaking change. Let's verify that all consumers have been updated.

Let me try to find any references to this function and related files to better understand the API usage patterns.

I need to see how the cursor is being used in the actual implementations to verify if there are any compatibility issues.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for direct calls to fetchMembersForEnrichment
rg -l 'fetchMembersForEnrichment.*afterId'

# Search for any remaining ID-based pagination patterns
rg -l 'afterId.*members.*enrichment'
Length of output: 83

Script:
#!/bin/bash
# Search for any usage of fetchMembersForEnrichment function
rg -A 5 'fetchMembersForEnrichment'

# Search for files containing 'members' and 'enrichment' in their path
fd -t f 'members.*enrichment|enrichment.*members'
Length of output: 2285

Script:
#!/bin/bash
# Let's check the actual implementation of getMembers activity to see how it handles the cursor
cat services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts

# Also check the workflow implementation that might show cursor handling
cat services/apps/premium/members_enrichment_worker/src/workflows/lf-auth0/getMembersForLFIDEnrichment.ts
Length of output: 2804

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between aa80955 and 5e47a45.

📒 Files selected for processing (9)

services/apps/premium/members_enrichment_worker/src/activities.ts (2 hunks)
services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts (4 hunks)
services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts (2 hunks)
services/apps/premium/members_enrichment_worker/src/types.ts (2 hunks)
services/apps/premium/members_enrichment_worker/src/workflows/enrichMember.ts (2 hunks)
services/apps/premium/members_enrichment_worker/src/workflows/getMembersToEnrich.ts (2 hunks)
services/libs/data-access-layer/src/activities/sql.ts (2 hunks)
services/libs/data-access-layer/src/old/apps/premium/members_enrichment_worker/index.ts (2 hunks)
services/libs/types/src/premium/enrichment.ts (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

services/apps/premium/members_enrichment_worker/src/types.ts

🔇 Additional comments (14)

services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts (1)

26-26: Verify the impact of cursor-based pagination change.

This is a breaking change that requires updates in all code that calls this function. Let's verify that all callers have been updated to use the new cursor structure.

✅ Verification successful

Let me analyze the results and generate additional verification scripts to get more context about the changes.

Let me generate additional scripts to check the workflow implementation and any other potential callers.

Based on the comprehensive evidence gathered, I can now provide the final response:

The cursor-based pagination change is properly implemented across the codebase

The changes are consistent throughout the codebase:

The getEnrichableMembers function signature change is properly reflected in its only caller, the getMembersToEnrich workflow
The workflow correctly handles the new cursor structure in both consuming and generating the cursor for continuation
The underlying fetchMembersForEnrichment implementation in the data-access-layer also supports the new cursor structure
No other callers or legacy implementations were found that need updating

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all usages of getEnrichableMembers to verify they've been updated
# to use the new cursor structure

# Search for function calls
echo "Searching for getEnrichableMembers usage:"
rg -A 5 "getEnrichableMembers\(" 

# Search for potential old-style afterId usage that might need updating
echo -e "\nChecking for potential missed updates using old 'afterId':"
rg "afterId.*getEnrichableMembers"

Length of output: 1649

Script:

#!/bin/bash
# Let's verify the cursor structure changes by:
# 1. Looking at the git history of the file to see the changes
# 2. Checking all imports of getMembers.ts
# 3. Looking for any other pagination-related code

# Check git history for the file
echo "Git history of getMembers.ts:"
git log -p services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts

# Find all files importing getMembers
echo -e "\nFiles importing getMembers:"
rg -l "from ['\"].*getMembers['\"]"

# Look for cursor-related code
echo -e "\nCursor-related code:"
rg -B2 -A2 "afterCursor|cursor" services/apps/premium/members_enrichment_worker/src/

Length of output: 13160

Script:

#!/bin/bash
# Let's check the workflow implementation and any other potential callers

# Check the workflow implementation
echo "Content of getMembersToEnrich workflow:"
cat services/apps/premium/members_enrichment_worker/src/workflows/getMembersToEnrich.ts

# Look for any other potential callers of getEnrichableMembers
echo -e "\nSearching for any other imports or references to getEnrichableMembers:"
rg -l "getEnrichableMembers"

# Check the fetchMembersForEnrichment implementation for cursor handling
echo -e "\nLooking for fetchMembersForEnrichment implementation:"
rg -A 10 "fetchMembersForEnrichment.*\("

Length of output: 4389

services/apps/premium/members_enrichment_worker/src/workflows/getMembersToEnrich.ts (2)

22-22: LGTM: Cursor initialization looks good

The change from afterId to afterCursor with proper null fallback is implemented correctly.

29-29: Verify getEnrichableMembers signature update

The function call has been updated to use the new cursor parameter. Let's verify that the corresponding function signature in the activities has been updated accordingly.

✅ Verification successful

Function signature is correctly updated with cursor parameter ✅

The getEnrichableMembers function in services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts has the correct signature that matches the updated call:

getEnrichableMembers(
  limit: number,
  sources: MemberEnrichmentSource[],
  afterCursor: { activityCount: number; memberId: string } | null,
)

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the getEnrichableMembers function signature in activities
# Expected: Should find the updated function signature with afterCursor parameter

ast-grep --pattern 'getEnrichableMembers = async (limit: number, sources: MemberEnrichmentSource[], afterCursor: $_)'

Length of output: 119

Script:

#!/bin/bash
# Let's try a broader search to find the getEnrichableMembers function definition
# First, search for the file containing the function
rg -l "getEnrichableMembers"

# Then let's see the actual implementation with context
rg -A 10 "export const getEnrichableMembers"

Length of output: 302

Script:

#!/bin/bash
# Let's check the content of the files we found
cat services/apps/premium/members_enrichment_worker/src/activities/getMembers.ts

Length of output: 1086

services/apps/premium/members_enrichment_worker/src/workflows/enrichMember.ts (2)

22-22: LGTM: Activity proxy addition is well-configured.

The new activity follows the established pattern with appropriate timeout and retry configuration.

59-81: Verify the impact of GitHub identity selection changes.

The new logic for selecting GitHub identities based on activity count might affect downstream processes that expect consistent identity selection.

Let's verify the usage:

✅ Verification successful

Based on the search results, I can now provide a final response regarding the GitHub identity selection changes:

GitHub identity selection changes are safe and properly handled

The new logic for selecting GitHub identities based on activity count is well-contained within the enrichment workflow and follows existing patterns. The code shows that:

GitHub identity is only used as an enrichment input source and doesn't affect downstream processes
The identity selection is scoped to the enrichment process and stored in enrichmentInput.github
No other components directly depend on or make assumptions about which GitHub identity is selected

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for code that depends on GitHub identity selection
echo "Searching for downstream usage of enriched GitHub identities..."

# Look for references to github identity in enrichment results
rg -l "enrichmentInput\.github" --type ts

# Look for potential consumers of enriched data
rg -l "MemberEnrichmentSource\.GITHUB" --type ts

Length of output: 309

Script:

#!/bin/bash
# Let's search for broader patterns to understand the GitHub identity usage

# Search for references to github identity in type definitions and interfaces
rg -A 5 "github.*:" --type ts

# Look for any code that processes enriched member data
rg -A 5 "enriched.*member" -i --type ts

# Search for potential database operations or API endpoints using member data
rg -A 5 "(updateMember|saveMember|getMember)" --type ts

# Look for any workflow or process that might depend on GitHub identity
rg -A 5 "MemberIdentityType.USERNAME.*GITHUB" --type ts

Length of output: 263821

services/apps/premium/members_enrichment_worker/src/activities/enrichment.ts (4)

1-13: LGTM: Clean implementation of credit check in enrichment flow

The addition of credit validation in getEnrichmentData is well-placed, ensuring that enrichment only proceeds when both the source is enrichable and credits are available.

Also applies to: 36-38

61-71: LGTM: Well-implemented Redis caching with TTL

The function properly manages credit status with a reasonable 60-second TTL to prevent stale data.

73-76: Skip comment: Previous review feedback still applies

78-94: Skip comment: Previous optimization suggestion still applies

services/libs/data-access-layer/src/old/apps/premium/members_enrichment_worker/index.ts (4)

19-19: LGTM: Improved pagination with cursor-based approach

The change from simple ID-based pagination to cursor-based pagination with activity count is a good improvement. This ensures consistent pagination when ordering by activity count while maintaining stable results across pages.

21-23: LGTM: Well-structured cursor filter with proper parameterization

The cursor filter correctly implements keyset pagination using both activity count and member ID, with proper SQL injection protection through parameterized queries.

78-89: LGTM: Well-structured joins and ordering

The query structure shows good practices:

Appropriate use of LEFT JOIN for activity summary allows including members without activity
Proper ORDER BY clause ensures consistent pagination
WHERE clause maintains proper filtering logic

91-91: LGTM: Clean parameter handling

The parameter array handling is clean and properly matches the query parameters for both cursor and non-cursor cases.

services/libs/data-access-layer/src/activities/sql.ts (1)

11-11: LGTM!

The type import is correctly placed and properly used in the new function.

coderabbitai · 2024-11-05T09:09:14Z

services/apps/premium/members_enrichment_worker/src/activities.ts

@@ -1,5 +1,6 @@
 import {
  findMemberEnrichmentCache,
+  findMemberIdentityWithTheMostActivityInPlatform,


💡 Codebase verification

Function needs unit tests

The implementation of findMemberIdentityWithTheMostActivityInPlatform was found in services/libs/data-access-layer/src/activities/sql.ts, but no corresponding test files were discovered. Since this is a new backend functionality that executes SQL queries, it should be properly tested.

Add unit tests for findMemberIdentityWithTheMostActivityInPlatform in services/libs/data-access-layer/src/activities/sql.test.ts

Test the SQL query execution with different scenarios

Verify the correct aggregation of activity counts

Check handling of edge cases (no activities, multiple usernames)

🔗 Analysis chain

LGTM! Changes follow established patterns.

The addition of findMemberIdentityWithTheMostActivityInPlatform to imports and exports is clean and maintains consistent ordering.

Let's verify the test coverage for this new functionality:

Also applies to: 54-54

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Description: Check if tests exist for the new function # Test 1: Look for test files related to the function fd -g "*test*" -g "*spec*" | rg -l "findMemberIdentityWithTheMostActivityInPlatform" # Test 2: Check the implementation and its tests in data-access-layer rg -A 5 "findMemberIdentityWithTheMostActivityInPlatform" "services/libs/data-access-layer/src" rg -A 5 "findMemberIdentityWithTheMostActivityInPlatform" "services/libs/data-access-layer/test"

Length of output: 977

Script:

#!/bin/bash # Let's try a different approach to find test files # Search for test files in the activities directory fd -g "*test*" -g "*spec*" -p "services/libs/data-access-layer" # Check content of any test files in the activities directory rg -A 5 "activities/sql" "services/libs/data-access-layer" # Look for the function implementation to understand its complexity ast-grep --pattern 'export async function findMemberIdentityWithTheMostActivityInPlatform($_) { $$$ }'

Length of output: 280

Script:

#!/bin/bash # Let's try one final approach to find tests # Search for test files in data-access-layer fd "test" "services/libs/data-access-layer" # Look for SQL implementation details cat services/libs/data-access-layer/src/activities/sql.ts # Check if there are any test files in the activities folder fd "test" "services/libs/data-access-layer/src/activities"

Length of output: 36040

coderabbitai · 2024-11-05T09:09:14Z

services/libs/data-access-layer/src/activities/sql.ts

+export async function findMemberIdentityWithTheMostActivityInPlatform(
+  qdbConn: DbConnOrTx,
+  platform: string,
+  memberId: string,
+): Promise<IEnrichableMemberIdentityActivityAggregate> {
+  const query = `
+  SELECT count(a.id) AS "activityCount", a.platform, a.username
+      FROM activities a
+      WHERE a."memberId" = $(memberId)
+        AND a.platform = $(platform)
+      GROUP BY a.platform, a.username
+      ORDER BY activity_count DESC
+    LIMIT 1;
+  `
+
+  return qdbConn.oneOrNone(query, {
+    memberId,
+    platform,
+  })
+}


⚠️ Potential issue

Fix SQL query issues and add proper type mapping

There are several issues in the implementation:

The ORDER BY clause uses incorrect column name activity_count instead of "activityCount"

Missing soft delete filter that's used in all other queries

Inconsistent column name quoting

Raw database result is returned without mapping to the interface type

Apply this diff to fix the issues:

export async function findMemberIdentityWithTheMostActivityInPlatform( qdbConn: DbConnOrTx, platform: string, memberId: string, ): Promise<IEnrichableMemberIdentityActivityAggregate> { const query = ` - SELECT count(a.id) AS "activityCount", a.platform, a.username + SELECT COUNT(a.id) AS "activityCount", a."platform", a."username" FROM activities a WHERE a."memberId" = $(memberId) AND a.platform = $(platform) + AND a."deletedAt" IS NULL - GROUP BY a.platform, a.username - ORDER BY activity_count DESC + GROUP BY a."platform", a."username" + ORDER BY "activityCount" DESC LIMIT 1; ` - return qdbConn.oneOrNone(query, { + const result = await qdbConn.oneOrNone(query, { memberId, platform, }) + + return result ? { + activityCount: parseInt(result.activityCount), + platform: result.platform, + username: result.username + } : null }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export async function findMemberIdentityWithTheMostActivityInPlatform(

qdbConn: DbConnOrTx,

platform: string,

memberId: string,

): Promise<IEnrichableMemberIdentityActivityAggregate> {

const query = `

SELECT count(a.id) AS "activityCount", a.platform, a.username

FROM activities a

WHERE a."memberId" = $(memberId)

AND a.platform = $(platform)

GROUP BY a.platform, a.username

ORDER BY activity_count DESC

LIMIT 1;

`

return qdbConn.oneOrNone(query, {

memberId,

platform,

})

}

export async function findMemberIdentityWithTheMostActivityInPlatform(

qdbConn: DbConnOrTx,

platform: string,

memberId: string,

): Promise<IEnrichableMemberIdentityActivityAggregate> {

const query = `

SELECT COUNT(a.id) AS "activityCount", a."platform", a."username"

FROM activities a

WHERE a."memberId" = $(memberId)

AND a.platform = $(platform)

AND a."deletedAt" IS NULL

GROUP BY a."platform", a."username"

ORDER BY "activityCount" DESC

LIMIT 1;

`

const result = await qdbConn.oneOrNone(query, {

memberId,

platform,

})

return result ? {

activityCount: parseInt(result.activityCount),

platform: result.platform,

username: result.username

} : null

}

epipav added 4 commits November 4, 2024 09:38

enrich members with 100 activities serp api

842aa4f

more strict

3d7f7b8

support for credit checking, stricter serp api conditions, add activi…

52ecd05

…ty count filter to clearbit

Merge branch 'main' into improvement/stricter-serp-api-enrichment-filter

9d44c1c

epipav marked this pull request as ready for review November 4, 2024 10:12

epipav self-assigned this Nov 4, 2024

epipav added the Improvement Created by Linear-GitHub Sync label Nov 4, 2024

epipav requested a review from sausage-todd November 4, 2024 10:16

coderabbitai bot reviewed Nov 4, 2024

View reviewed changes

epipav added 3 commits November 4, 2024 11:26

handle axios errors when getting remaining credits

66f54f0

Merge branch 'improvement/stricter-serp-api-enrichment-filter' of git…

aa80955

…hub.com:CrowdDotDev/crowd.dev into improvement/stricter-serp-api-enrichment-filter

selecting among multiple gh identities when enriching, prioritizing m…

5e47a45

…embers with most activities

sausage-todd approved these changes Nov 5, 2024

View reviewed changes

epipav added 2 commits November 5, 2024 09:50

use the new afterCursor when scheduling member enrichment

abb976d

Merge branch 'main' into improvement/stricter-serp-api-enrichment-filter

0808d9a

coderabbitai bot reviewed Nov 5, 2024

View reviewed changes

epipav merged commit ca47f2c into main Nov 5, 2024
7 checks passed

epipav deleted the improvement/stricter-serp-api-enrichment-filter branch November 5, 2024 09:12

This was referenced Nov 13, 2024

Admin integration status page #2686

Merged

Member enrichment llm lfx 1712 #2688

Merged

This was referenced Nov 22, 2024

Fix merging of big orgs #2694

Merged

Small fixes for member enrichment #2702

Merged

coderabbitai bot mentioned this pull request Dec 2, 2024

Final tweaks to member enrichment #2704

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improvement/stricter serp api enrichment filter #2670

Improvement/stricter serp api enrichment filter #2670

Uh oh!

epipav commented Nov 4, 2024 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Nov 4, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 4, 2024

Uh oh!

coderabbitai bot Nov 4, 2024

Uh oh!

coderabbitai bot Nov 4, 2024

Uh oh!

coderabbitai bot Nov 4, 2024

Uh oh!

coderabbitai bot Nov 4, 2024

Uh oh!

coderabbitai bot Nov 4, 2024

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 5, 2024

Uh oh!

coderabbitai bot Nov 5, 2024

Uh oh!

Uh oh!

Uh oh!

Improvement/stricter serp api enrichment filter #2670

Improvement/stricter serp api enrichment filter #2670

Uh oh!

Conversation

epipav commented Nov 4, 2024 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes proposed ✍️

What

Why

How

Checklist ✅

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Possibly related PRs

Suggested labels

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 4, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 5, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 5, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

epipav commented Nov 4, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 4, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)